Chapter

Transformer-Based Multi-industry Electricity Demand Forecasting

Abstract

Accurate electricity demand forecasting underpins sound decision-making in the power system and, through it, a stable energy supply, which is a necessary guarantee for socioeconomic development and normal human life. Accurate forecasts provide reliable guidance for electricity production and supply dispatch, improve the power system's supply quality, and ultimately enhance the security and cost-effectiveness of power grid operation, which is crucial for boosting economic and social benefits. Current research on electricity demand forecasting focuses mainly on single-factor relationships between power consumption and, for example, economic growth or industrial development, while neglecting the combined effect of multiple influencing factors and different time dependencies. To address this challenge, we propose a transformer-based forecasting model that combines a transformer network with fully connected (FC) neural networks to forecast electricity demand for different industries within a city. The model employs the transformer encoder to capture the dependencies between influencing factors and uses the FC layers to capture time dependencies. We evaluate our approach on electricity demand forecasting datasets from multiple cities and industries using various metrics. The experimental results demonstrate that the proposed method outperforms state-of-the-art methods in accuracy and robustness. Overall, we provide a valuable framework for electricity demand forecasting, with practical significance for stable power system operation.
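
The abstract's two-stage idea (attention across influencing factors, then a fully connected layer that produces the forecast) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give layer counts, dimensions, or how the time axis is arranged, so everything below is an assumption made for illustration. In particular, the sketch treats each influencing factor as one token whose features are its values over the input window, uses a single attention head with no layer normalization or feed-forward sublayer, and uses random weights in place of trained ones. The names `factor_attention`, `forecast`, and all sizes are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def factor_attention(tokens, wq, wk, wv):
    """Single-head self-attention in which each token is one influencing factor."""
    q, k, v = matmul(tokens, wq), matmul(tokens, wk), matmul(tokens, wv)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)  # factor-by-factor similarity
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def forecast(window, params):
    """Map a window (T time steps x F factors) to a list of future demand values."""
    tokens = [list(col) for col in zip(*window)]  # F tokens, each of length T
    encoded, weights = factor_attention(
        tokens, params["wq"], params["wk"], params["wv"]
    )
    flat = [[v for row in encoded for v in row]]  # 1 x (F * D)
    out = matmul(flat, params["w_fc"])[0]  # fully connected output layer
    return [o + b for o, b in zip(out, params["b_fc"])], weights


# Hypothetical sizes: 6 time steps, 3 factors, width 4, 2-step forecast horizon.
rng = random.Random(0)
T, F, D, H = 6, 3, 4, 2
params = {
    "wq": rand_matrix(T, D, rng),
    "wk": rand_matrix(T, D, rng),
    "wv": rand_matrix(T, D, rng),
    "w_fc": rand_matrix(F * D, H, rng),
    "b_fc": [0.0] * H,
}
window = [[rng.random() for _ in range(F)] for _ in range(T)]
prediction, attn = forecast(window, params)
```

Here `attn` is an F-by-F matrix whose row i shows how strongly factor i attends to every other factor, which is one way to read "dependencies between influencing factors". A practical model would more likely be built from a deep-learning library's encoder layers and trained on historical demand data.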
