Article

Big Data Analytics for Identifying Electricity Theft using Machine Learning Approaches in Micro Grids for Smart Communities


Abstract

Electricity Theft (ET) causes major revenue loss in power utilities. It reduces the quality of supply, raises production cost, forces legitimate consumers to pay higher prices and impacts the economy as a whole. In this paper, we use the State Grid Corporation of China (SGCC) dataset, which contains electricity consumption data over 1035 days for two classes: normal and fraudulent. An Electricity Theft Detection (ETD) model is proposed that consists of four steps: interpolation, data balancing, feature extraction and classification. Firstly, missing values in the dataset are recovered using an interpolation method. Secondly, a resampling technique is applied: ET consumers make up only 9% of the SGCC dataset, which makes the model inefficient at correctly classifying both classes (normal and theft), so a hybrid resampling technique named Synthetic Minority Oversampling Technique with Near Miss (SMOTE-NM) is proposed. Thirdly, a Residual Network (ResNet) extracts latent features from the SGCC dataset. Fourthly, three tree-based classifiers, namely Decision Tree (DT), Random Forest (RF) and Adaptive Boosting (AdaBoost), are trained on the encoded feature vectors for classification. Moreover, the search for good hyperparameters is a challenging task that is usually done manually and takes a considerable amount of time. To resolve this problem, a Bayesian optimizer is used to simplify the tuning of DT, RF and AdaBoost. Finally, the results indicate that RF outperforms DT and AdaBoost.
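As a rough illustration of the resampling and tuning steps described in the abstract (not the authors' exact implementation), the sketch below combines SMOTE oversampling with NearMiss undersampling and tunes a Random Forest with a Bayesian search. It assumes the scikit-learn, imbalanced-learn and scikit-optimize packages; the synthetic data stands in for the ResNet-encoded SGCC feature vectors, and the sampling ratios and hyperparameter ranges are illustrative.

```python
# Hypothetical sketch of a SMOTE + NearMiss hybrid resampler with a Bayesian-tuned Random Forest.
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import NearMiss
from imblearn.pipeline import Pipeline
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from skopt import BayesSearchCV
from skopt.space import Integer

# Synthetic stand-in for the ResNet-encoded feature vectors (~9% theft class).
X, y = make_classification(n_samples=2000, n_features=64, weights=[0.91, 0.09], random_state=0)

pipe = Pipeline(steps=[
    ("smote", SMOTE(sampling_strategy=0.5, random_state=42)),  # partially oversample the theft class
    ("nearmiss", NearMiss(version=1)),                         # undersample the majority class
    ("rf", RandomForestClassifier(random_state=42)),
])

# Bayesian optimization over a few illustrative RF hyperparameters.
search = BayesSearchCV(
    pipe,
    search_spaces={
        "rf__n_estimators": Integer(50, 500),
        "rf__max_depth": Integer(3, 30),
        "rf__min_samples_leaf": Integer(1, 10),
    },
    n_iter=25,
    scoring="roc_auc",
    cv=5,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```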


... Mostly, the aforementioned literature focuses on classifier design or feature engineering algorithms, where conventional classifiers such as SVM and DT are popular [3], [19]. However, SVM usually has a high computational cost, and finding optimal hyper-parameter values to achieve strong classification performance is challenging. ...
... Notably, the higher the AUC, the better the classifier's performance. When the ROC curve rises steeply towards the maximum value and then runs parallel to the x-axis, it indicates that both classes are distinguished perfectly by the classifier [19]. By contrast, when AUC = 0.5 and the curve lies close to the diagonal line, the classifier has no power to discriminate between the two classes. ...
... INTRODUCTION Electricity theft contributes hugely to the Non-Technical Losses (NTL) incurred by power companies globally [1]- [3]. Electricity theft is a socio-economic problem faced around the globe. ...
... In [3], an electricity theft detection model was proposed. The proposed model involved four steps: interpolation (recovery of missing values in the dataset), data balancing (through resampling), feature extraction using residual network and classification by tree-based classifiers: Decision Tree (DT), Random Forest (RF) and Adaptive Boosting (AdaBoost). ...
Conference Paper
The schemes devised by dishonest consumers to thwart consumption measurement systems continue to increase in scope and dimension, diminishing the value delivered by power sectors across the world. This paper proposes a hybrid model for efficient periodicity analysis without incurring the computational overhead often experienced with the Dynamic Time Warping (DTW) algorithm. The autocorrelation function was used to screen candidate periods from an FFT-generated periodogram and obtain a single, accurate period, which the DTW algorithm then used to determine the similarity of each data instance to its shifted version. The resulting segregated data, labelled either normal or theft, was then employed to construct machine learning models for electricity theft detection. A Decision Tree Classifier (DTC), two tree-based ensembles (Random Forest (RFC) and Extremely Randomized Trees (ETC)) and an SVM model were trained on two separate categories of data: daily and mean weekly consumption data. The results show that the periodicity detection approach is promising. The ETC model produced the highest AUCs of 95% and 98% on the daily and mean weekly data, respectively.
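A minimal NumPy sketch of the period-screening idea described above: the FFT periodogram proposes candidate periods and the autocorrelation function confirms one. It assumes a 1-D consumption series; the DTW similarity step and the classifier training are omitted.

```python
# Hypothetical period screening: FFT periodogram proposes candidates,
# the autocorrelation function picks the most plausible one.
import numpy as np

def candidate_periods(x, top_k=5):
    x = np.asarray(x, dtype=float) - np.mean(x)
    spectrum = np.abs(np.fft.rfft(x)) ** 2              # periodogram
    freqs = np.fft.rfftfreq(len(x))
    idx = np.argsort(spectrum[1:])[::-1][:top_k] + 1    # skip the DC bin
    return [int(round(1.0 / freqs[i])) for i in idx if freqs[i] > 0]

def best_period(x, candidates):
    x = np.asarray(x, dtype=float) - np.mean(x)
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    acf /= acf[0]                                        # normalized autocorrelation
    valid = [p for p in candidates if 0 < p < len(x)]
    return max(valid, key=lambda p: acf[p])              # candidate with strongest ACF peak

# Toy daily-consumption series with a weekly (period-7) pattern plus noise.
rng = np.random.default_rng(0)
series = np.tile([1.0, 1.1, 0.9, 1.0, 1.2, 1.8, 1.7], 150) + 0.1 * rng.standard_normal(1050)
print(best_period(series, candidate_periods(series)))    # expected: 7
```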
... Similarly, the FP and FN scores give the numbers of honest and dishonest consumers that are misclassified. It is pertinent to mention that the honest consumption pattern is assigned class label 0 and the dishonest consumption pattern is assigned class label 1 [14]. Based on the CM results, the following Eqs. ...
... The area under the ROC curve is termed the Area Under the Curve (AUC). The AUC differentiates the distribution of the fair class from that of the fraudulent class and is expressed as follows in Eq. (14): ...
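For reference, and without reproducing the truncated Eq. (14) above, the standard confusion-matrix quantities these snippets rely on can be written as:

```latex
\mathrm{TPR} = \frac{TP}{TP + FN}, \qquad
\mathrm{FPR} = \frac{FP}{FP + TN}, \qquad
\mathrm{AUC} = \int_{0}^{1} \mathrm{TPR}\, d(\mathrm{FPR}).
```

An AUC of 1 corresponds to perfect separation of the honest (label 0) and fraudulent (label 1) classes, while an AUC of 0.5 corresponds to no discriminative power.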
Conference Paper
Full-text available
In Smart Grids (SG), Electricity Theft Detection (ETD) is of great importance because it makes the SG cost efficient. Existing methods for ETD cannot efficiently handle data imbalance, missing values, variance and non-linear data problems in the smart meter data. Therefore, an effective integrated strategy is required to address underlying issues and accurately detect electricity theft using big data. In this work, a simple yet effective approach is proposed by integrating two different modules, such as data pre-processing and classification, in a single framework. The first module involves data imputation, outliers handling, standardization and class balancing steps to generate quality data for classifier training. The second module classifies honest and dishonest users with a Support Vector Machine (SVM) classifier. To improve the classifier’s learning trend and accuracy, a Bayesian optimization algorithm is used to tune SVM’s hyperparameters. Simulation results confirm that the proposed framework for ETD significantly outperforms previous machine learning approaches such as random forest, logistic regression and SVM in terms of accuracy.
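A hedged sketch of the two-module idea above (pre-processing followed by a Bayesian-tuned SVM). The imputation and scaling choices and the hyperparameter ranges are illustrative rather than the paper's, and scikit-learn plus scikit-optimize are assumed.

```python
# Illustrative pre-processing + SVM module; not the paper's exact configuration.
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
from skopt import BayesSearchCV
from skopt.space import Real

X, y = make_classification(n_samples=1000, n_features=30, weights=[0.9, 0.1], random_state=1)

model = Pipeline(steps=[
    ("impute", SimpleImputer(strategy="median")),   # fill missing readings
    ("scale", StandardScaler()),                    # standardize consumption features
    ("svm", SVC(kernel="rbf", class_weight="balanced", probability=True)),
])

search = BayesSearchCV(
    model,
    search_spaces={"svm__C": Real(1e-2, 1e3, prior="log-uniform"),
                   "svm__gamma": Real(1e-4, 1e1, prior="log-uniform")},
    n_iter=20, scoring="roc_auc", cv=5,
)
search.fit(X, y)
```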
... Among existing supervised learning algorithms, the use of SVM and Logistic Regression (LR) has become an active area of research in ETD. However, these methods require manual feature extraction that relies on expert knowledge and do not perform data preprocessing [2], [3], [5], [26], [30]. In [3], the authors propose a wide and deep model to analyze electricity theft data. ...
... In the literature, hybrid deep learning techniques are mostly used for ETD, in which CNN, LSTM, and Random Forest (RF) models are of vital importance [9], [10], [26], [29]. Moreover, raw datasets are used as inputs for training and testing, which degrades the models' classification performance. ...
Conference Paper
Full-text available
In this paper, a data-driven solution is proposed to detect Non-Technical Losses (NTLs) in smart grids. In the real world, the number of theft samples is smaller than the number of benign samples, which leads to a data imbalance issue. To resolve the issue, diverse theft attacks are applied to the benign samples to generate synthetic theft samples for data balancing and to mimic real-world theft patterns. Furthermore, several non-malicious factors influence users' energy usage patterns, such as consumers' behavior during weekends, seasonal change and family structure. These factors adversely affect the model's performance, resulting in misclassification. So, non-malicious factors along with smart meters' data need to be considered to enhance the theft detection accuracy. Keeping this in view, a hybrid Multi-Layer Perceptron and Gated Recurrent Unit (MLP-GRU) based Deep Neural Network (DNN) is proposed to detect electricity theft. The MLP model takes auxiliary data such as geographical information as input, while the dataset of smart meters is provided as input to the GRU model. Due to the improved generalization capability of the MLP with reduced overfitting and the effective gated configuration of the multi-layered GRU, the proposed model proves to be an ideal solution in terms of prediction accuracy and computational time. Furthermore, the proposed model is compared with the existing MLP-LSTM model through simulations. The results show that MLP-GRU achieves scores of 0.87 and 0.89 for the Area under the Receiver Operating Characteristic Curve (ROC-AUC) and the Area under the Precision-Recall Curve (PR-AUC), respectively, as compared to 0.72 and 0.47 for MLP-LSTM.
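A minimal Keras sketch of a two-branch MLP-GRU network of the kind described above; the input shapes and layer sizes are assumptions, not the paper's architecture.

```python
# Hypothetical two-branch network: MLP for auxiliary data, GRU for smart-meter series.
import tensorflow as tf
from tensorflow.keras import layers, Model

aux_in = layers.Input(shape=(8,), name="auxiliary")           # e.g. geographic features
seq_in = layers.Input(shape=(30, 1), name="consumption")      # 30 daily readings

mlp = layers.Dense(32, activation="relu")(aux_in)
mlp = layers.Dropout(0.2)(mlp)

gru = layers.GRU(64, return_sequences=True)(seq_in)
gru = layers.GRU(32)(gru)

merged = layers.concatenate([mlp, gru])
out = layers.Dense(1, activation="sigmoid")(merged)           # theft / honest

model = Model(inputs=[aux_in, seq_in], outputs=out)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(name="roc_auc")])
model.summary()
```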
... Study [13] proposes an electricity theft detection method consisting of four steps: missing value interpolation, data balancing, feature extraction, and fraudulent behavior classification. The dataset used in this study was already separated into a normal and a fraudulent class, meaning there is no need for anomaly and theft injection. ...
Article
Smart grid gives more control and information to the utility companies. However, it can be leveraged for data manipulation, which can lead to new techniques in electricity theft. This paper presents an electricity theft detection framework, designed for handling real-time large-scale smart grid data to address these new emerging threats. It uses a hybrid approach, combining the information inferred by analyzing the reported data from distribution transformer meters with machine learning algorithms to discover fraudulent activity. We added an additional form of attack to the six previously known patterns and generated malicious variants of consumption data to solve the problem of imbalanced dataset classes, resulting in more accurate classifiers. The framework also allows for a trade-off between the detection rate and triggered false alarms by using a sliding window in the decision-making process. In the end, the proposed framework is evaluated using well-known clustering and classification methods in a practical scenario, resulting in outcomes superior or equal to the previously achieved scores while having the advantages of online and distributed processing.
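The attack-variant idea can be illustrated with a few theft functions that are commonly used in this literature (constant scaling, per-sample random scaling, interval cut-off). These are generic examples only; the six known patterns and the paper's additional attack are not reproduced here.

```python
# Illustrative (not the paper's exact) theft functions applied to a benign consumption profile.
import numpy as np

rng = np.random.default_rng(7)

def attack_constant_scale(x, low=0.1, high=0.8):
    """Report a fixed fraction of the real consumption."""
    return rng.uniform(low, high) * x

def attack_random_scale(x, low=0.1, high=0.8):
    """Report a different random fraction at every time step."""
    return rng.uniform(low, high, size=x.shape) * x

def attack_cutoff(x):
    """Report zero consumption during a random contiguous interval."""
    y = x.copy()
    start = rng.integers(0, len(x) - 1)
    end = rng.integers(start + 1, len(x))
    y[start:end] = 0.0
    return y

benign = rng.uniform(0.5, 2.0, size=48)          # a benign half-hourly profile
synthetic_theft = [f(benign) for f in (attack_constant_scale, attack_random_scale, attack_cutoff)]
```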
... The performance is determined from the CM, i.e., the matrix used to explain distinct outcomes in classification problems, as alluded to earlier and shown in Fig. 4. In binary classification tasks, class label 0 is dedicated to honest consumers and class label 1 is assigned to dishonest consumers [36]. Here, the TP (1,1) and TN (0,0) scores mean that abnormal and normal consumption patterns, respectively, are identified accurately. ...
Article
Full-text available
The role of electricity theft detection (ETD) is critical to maintain cost-efficiency in smart grids. However, existing methods for theft detection can struggle to handle large electricity consumption datasets because of missing values, data variance and nonlinear data relationship problems, and there is a lack of integrated infrastructure for coordinating electricity load data analysis procedures. To help address these problems, a simple yet effective ETD model is developed. Three modules are combined into the proposed model. The first module deploys a combination of data imputation, outlier handling, normalization and class balancing algorithms, to enhance the time series characteristics and generate better quality data for improved training and learning by the classifiers. Three different machine learning (ML) methods, which are uncorrelated and skillful on the problem in different ways, are employed as the base learning model. Finally, a recently developed deep learning approach, namely a temporal convolutional network (TCN), is used to ensemble the outputs of the ML algorithms for improved classification accuracy. Experimental results confirm that the proposed framework yields a highly-accurate, robust classification performance, in comparison to other well-established machine and deep learning models and thus can be a practical tool for electricity theft detection in industrial applications.
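A rough stacking sketch of that ensembling idea: three dissimilar base learners produce out-of-fold probabilities, and a small dilated causal Conv1D network stands in for the paper's TCN meta-learner. The base-model choices, shapes and training settings are assumptions for illustration only.

```python
# Illustrative stacking: out-of-fold probabilities from three base models are
# combined by a small dilated Conv1D network standing in for the paper's TCN.
import numpy as np
import tensorflow as tf
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

X, y = make_classification(n_samples=2000, n_features=40, weights=[0.9, 0.1], random_state=0)

base_models = [LogisticRegression(max_iter=1000),
               RandomForestClassifier(n_estimators=200, random_state=0),
               GradientBoostingClassifier(random_state=0)]

# Out-of-fold probability of the theft class from each base learner.
oof = np.column_stack([cross_val_predict(m, X, y, cv=5, method="predict_proba")[:, 1]
                       for m in base_models])
meta_in = oof[:, :, None]                      # shape (samples, 3, 1) as a short "sequence"

meta = tf.keras.Sequential([
    tf.keras.layers.Conv1D(16, kernel_size=2, dilation_rate=1, padding="causal",
                           activation="relu", input_shape=(3, 1)),
    tf.keras.layers.Conv1D(16, kernel_size=2, dilation_rate=2, padding="causal",
                           activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
meta.compile(optimizer="adam", loss="binary_crossentropy")
meta.fit(meta_in, y, epochs=5, batch_size=64, verbose=0)
```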
... In [74][75][76][77][78], various consumption behaviours of different users are considered. The consumption behaviour of each customer gives different results. ...
Research Proposal
Full-text available
In this synopsis, the first solution introduces a hybrid deep learning model, which tackles the class imbalance problem, the curse of dimensionality and the low detection rate of existing models. The proposed model integrates the benefits of both GoogLeNet and the gated recurrent unit (GRU). The one dimensional EC data is fed into the GRU to remember periodic patterns, whereas the GoogLeNet model is leveraged to extract latent features from the two dimensional weekly stacked EC data. Furthermore, the time least square generative adversarial network is proposed to solve the class imbalance problem. The second solution presents a framework, which is employed to solve the curse of dimensionality issue. In the literature, the existing studies are mostly concerned with tuning the hyperparameters of ML/DL methods for efficient detection of NTL. Some of them focus on the selection of prominent features from data to improve the performance of electricity theft detection. However, the curse of dimensionality affects the generalization ability of ML/DL classifiers and leads to computational, storage and overfitting problems. Therefore, to deal with the above-mentioned issues, this study proposes a system based on metaheuristic techniques (artificial bee colony and genetic algorithm) and a denoising autoencoder for electricity theft detection using big data in electric power systems. The third solution introduces a hybrid deep learning model for prediction of upward and downward trends in financial market data. The financial market exhibits complex and volatile behavior that is difficult to predict using conventional machine learning (ML) and statistical methods, as well as shallow neural networks. Its behavior depends on many factors such as political upheavals, investor sentiment, interest rates, government policies, natural disasters, etc. However, it is possible to predict upward and downward trends in financial market behavior using complex DL models. In this synopsis, we have proposed three solutions to solve different issues in smart grids and the financial market. The proposed solutions will be validated in the thesis using real-world datasets.
... In [74,75,76,77,78], various consumption behaviours of different users are considered. The consumption behaviour of each customer gives different results. ...
Thesis
Full-text available
Data science is an emerging field, which has applications in multiple disciplines, like healthcare, advanced image recognition, airline route planning, augmented reality, targeted advertising, etc. In this thesis, we have exploited its applications in smart grids and financial markets with three major contributions. In the first two contributions, machine learning (ML) and deep learning (DL) models are utilized to detect anomalies in electricity consumption (EC) data, while in the third contribution, upward and downward trends in the financial markets are predicted to benefit potential investors. Non-technical losses (NTLs) are one of the major causes of revenue losses for electric utilities. In the literature, various ML and DL approaches are employed to detect NTLs. The first solution introduces a hybrid DL model, which tackles the class imbalance problem, the curse of dimensionality and the low detection rate of existing models. The proposed model integrates the benefits of both GoogLeNet and the gated recurrent unit (GRU). The one dimensional EC data is fed into the GRU to remember periodic patterns, whereas the GoogLeNet model is leveraged to extract latent features from the two dimensional weekly stacked EC data. Furthermore, the time least square generative adversarial network (TLSGAN) is proposed to solve the class imbalance problem. The TLSGAN uses unsupervised and supervised loss functions to generate fake theft samples, which closely resemble real-world theft samples. The standard generative adversarial network only updates the weights of points that lie on the wrong side of the decision boundary, whereas the TLSGAN also modifies the weights of points that lie on the correct side of the decision boundary, which prevents the model from the vanishing gradient problem. Moreover, dropout and batch normalization layers are utilized to enhance the model's convergence speed and generalization ability. The proposed model is compared with different state-of-the-art classifiers including multilayer perceptron (MLP), support vector machine, Naive Bayes, logistic regression, MLP-long short term memory network and wide and deep convolutional neural network. The second solution presents a framework, which is employed to solve the curse of dimensionality issue. In the literature, the existing studies are mostly concerned with tuning the hyperparameters of ML/DL methods for efficient detection of NTL, i.e., electricity theft detection. Some of them focus on the selection of prominent features from data to improve the performance of electricity theft detection. However, the curse of dimensionality affects the generalization ability of ML/DL classifiers and leads to computational, storage and overfitting problems. Therefore, to deal with the above-mentioned issues, this study proposes a system based on metaheuristic techniques (artificial bee colony and genetic algorithm) and a denoising autoencoder for electricity theft detection using big data in electric power systems. The former (metaheuristics) are used to select prominent features, while the latter is utilized to extract high variance features from electricity consumption data. First, new features are synthesized using statistical and electrical parameters from the user's consumption history. Then, the synthesized features are used as input to the metaheuristic techniques to find a subset of optimal features. Finally, the optimal features are fed as input to the denoising autoencoder to extract features with high variance.
The ability of both techniques to select and extract features is measured using a support vector machine. The proposed system reduces the overfitting, storage and computational overhead of ML classifiers. Moreover, we perform several experiments to verify the effectiveness of our proposed system and the results reveal that the proposed system has higher performance than its counterparts. The third solution introduces a hybrid DL model for prediction of upward and downward trends in financial market data. The financial market exhibits complex and volatile behavior that is difficult to predict using conventional ML and statistical methods, as well as shallow neural networks. Its behavior depends on many factors such as political upheavals, investor sentiment, interest rates, government policies, natural disasters, etc. However, it is possible to predict upward and downward trends in financial market behavior using complex DL models. This work therefore addresses the following limitations that adversely affect the performance of existing ML and DL models, i.e., the curse of dimensionality, the low accuracy of the standalone models, and the inability to learn complex patterns from high-frequency time series data. The denoising autoencoder is used to reduce the high dimensionality of the data, overcoming the problem of overfitting and reducing the training time of the ML and DL models. Moreover, a hybrid DL model HRG is proposed based on a ResNet module and gated recurrent units. The former is used to extract latent or abstract patterns that are not visible to the human eye, while the latter retrieves temporal patterns from the financial market dataset. Thus, HRG integrates the advantages of both models. It is evaluated on real-world financial market datasets obtained from IBM, APPL, BA and WMT. Also, various performance indicators such as F1-score, accuracy, precision, recall and receiver operating characteristic-area under the curve (ROC-AUC) are used to check the performance of the proposed and benchmark models. The HRG model achieves 0.95, 0.90, 0.82 and 0.80 ROC-AUC values on the APPL, IBM, BA and WMT datasets respectively, which are higher than the ROC-AUC values of all implemented ML and DL models.
... Still, they face a number of issues that need to be tackled, such as insufficient security and privacy, absence of trustworthiness and lack of users' willingness to participate in the VENs. The major issues in modern-day grids are electricity theft [72,73] and the detection of non-technical losses [74,75]. ...
Thesis
Full-text available
This thesis examines the use of blockchain technology with the Electric Vehicles (EVs) to tackle different issues related to the existing systems like privacy, security, lack of trust, etc., and to promote transparency, data immutability and tamper proof nature. Moreover, in this study, a new and improved charging strategy, termed as Mobile vehicle-to-Vehicle (M2V) charging strategy, is used to charge the EVs. It is further compared with conventional Vehicle-to-Vehicle (V2V) and Grid-to-Vehicle (G2V) charging strategies to prove its efficacy. In the proposed work, the charging of vehicles is done in a Peer-to-Peer (P2P) manner to remove the intermediary parties and deal with the issues related to them. Moreover, to store the data related to traffic, roads and weather conditions, a Transport System Information Unit (TSIU) is used, which helps in reducing road congestion and minimizing road side accidents. In TSIU, InterPlanetary File System (IPFS) is utilized to store the data in a secured manner. Furthermore, mathematical formulation of the total charging cost, the shortest distance between EVs and charging entities, and the time taken to traverse the shortest distance and to charge the vehicles is done using real time data of EVs. The phenomena of range anxiety and coordination at the crossroads are also dealt with in the study. Moving ahead, edge service providers are introduced to ensure efficient service provisioning. These nodes ensure smooth communication with EVs for successful service provisioning. A caching system is also introduced at the edge nodes to store frequently used services. The power flow and the related energy losses for G2V, V2V and M2V charging strategies are also discussed in this work. In addition, an incentive provisioning mechanism is proposed on the basis of timely delivery of credible messages, which further promotes users’ participation. Furthermore, a hybrid blockchain based vehicular announcement scheme is proposed through which secure and reliable announcement dissemination is realized. In addition, IOTA Tangle is used, which ensures decentralization of the system. The real identities of the vehicles are hidden using the pseudo identities generated through an Elliptic Curve Cryptography (ECC) based pseudonym update mechanism. Moreover, the lightweight trustworthiness verification of vehicles is performed using a Cuckoo Filter (CF). It also prevents revealing the reputation values given to the vehicles upon information dissemination. To reduce the delays caused due to inefficient digital signature verification, transactions are verified in the form of batches. Furthermore, a blockchain based revocation transparency enabled data-oriented trust model is proposed. Password Authenticated Key Exchange by Juggling (J-PAKE) scheme is used in the proposed model to enable mutual authentication. To prevent collusion attacks, message credibility check is performed using Real-time Message Content Validation (RMCV) scheme. Furthermore, K-anonymity algorithm is used to anonymize the reputation data and prevent privacy leakage by restricting the identification of the predictable patterns present in the reputation data. To enable revocation transparency, a Proof of Revocation (PoR) is designed for the revoked vehicles. The vehicle records are stored in IPFS. To enhance the chances of correct information dissemination, incentives are provided to the vehicles using a reputation based incentive mechanism. 
To check the robustness of the proposed model, attacker models are designed and tested against different attacks including selfish mining attack, double spending attack, etc. To prove the efficiency of the proposed work, extensive simulations are performed. The simulation results prove that the proposed study achieves high success in making EVs energy efficient, secure and robust. Furthermore, the security analysis of the smart contracts used in the proposed work is performed using Oyente, which exhibits the secure nature of the proposed work.
... In [7], [12], [13], [14], the authors point out that existing methods present no appropriate feature engineering mechanisms. The manual feature engineering process requires extra time and domain knowledge. ...
Conference Paper
Full-text available
In this paper, a novel hybrid deep learning approach is proposed to detect the nontechnical losses (NTLs) that occur in smart grids due to illegal use of electricity, faulty meters, meter malfunctioning, unpaid bills, etc. The proposed approach is based on data-driven methods due to the sufficient availability of smart meters' data. Therefore, a bi-directional Wasserstein generative adversarial network (Bi-WGAN) is utilized to generate the synthetic theft samples for solving the class imbalance problem. The Bi-WGAN efficiently synthesizes the minority class theft samples by leveraging the capabilities of an additional encoder module. Moreover, the curse of dimensionality degrades the model's generalization ability. Therefore, the high dimensionality issue is solved using the two dimensional convolutional neural network (2D-CNN) and bidirectional long short-term memory network (Bi-LSTM). The 2D-CNN is applied on 2D weekly data to extract the most prominent features. In 2D-CNN, the convolutional and pooling layers extract only the potential features and discard the redundant features to reduce the curse of dimensionality. This process increases the convergence speed of the model as well as reduces the computational overhead. Meanwhile, a Bi-LSTM is also used to detect the non-malicious changes in consumers' load profiles using its strong memorization capabilities. Finally, the outcomes of both models are concatenated into a single feature map and a sigmoid activation function is applied for final NTL detection. The simulation results demonstrate that the proposed model outperforms the existing scheme in terms of Matthews correlation coefficient (MCC), precision-recall (PR) and area under the curve (AUC). It achieves 3%, 5% and 4% greater MCC, PR and AUC scores, respectively, as compared to the existing model.
Chapter
In this paper, a novel hybrid deep learning approach is proposed to detect the nontechnical losses (NTLs) that occur in smart grids due to illegal use of electricity, faulty meters, meter malfunctioning, unpaid bills, etc. The proposed approach is based on data-driven methods due to the sufficient availability of smart meters' data. Therefore, a bi-directional Wasserstein generative adversarial network (Bi-WGAN) is utilized to generate the synthetic theft samples for solving the class imbalance problem. The Bi-WGAN efficiently synthesizes the minority class theft samples by leveraging the capabilities of an additional encoder module. Moreover, the curse of dimensionality degrades the model's generalization ability. Therefore, the high dimensionality issue is solved using the two dimensional convolutional neural network (2D-CNN) and bidirectional long short-term memory network (Bi-LSTM). The 2D-CNN is applied on 2D weekly data to extract the most prominent features. In 2D-CNN, the convolutional and pooling layers extract only the potential features and discard the redundant features to reduce the curse of dimensionality. This process increases the convergence speed of the model as well as reduces the computational overhead. Meanwhile, a Bi-LSTM is also used to detect the non-malicious changes in consumers' load profiles using its strong memorization capabilities. Finally, the outcomes of both models are concatenated into a single feature map and a sigmoid activation function is applied for final NTL detection. The simulation results demonstrate that the proposed model outperforms the existing scheme in terms of Matthews correlation coefficient (MCC), precision-recall (PR) and area under the curve (AUC). It achieves 3%, 5% and 4% greater MCC, PR and AUC scores, respectively, as compared to the existing model.
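A minimal Keras sketch of the 2D-CNN / Bi-LSTM branch fusion described above; the weekly 2-D input shape and layer sizes are assumed, and the Bi-WGAN sample-generation stage is omitted.

```python
# Hypothetical 2D-CNN + Bi-LSTM fusion for NTL detection; shapes are illustrative.
import tensorflow as tf
from tensorflow.keras import layers, Model

weekly_in = layers.Input(shape=(52, 7, 1), name="weekly_2d")    # 52 weeks x 7 days
daily_in = layers.Input(shape=(365, 1), name="daily_series")    # 1-D consumption series

cnn = layers.Conv2D(16, (3, 3), padding="same", activation="relu")(weekly_in)
cnn = layers.MaxPooling2D((2, 2))(cnn)
cnn = layers.Conv2D(32, (3, 3), padding="same", activation="relu")(cnn)
cnn = layers.GlobalAveragePooling2D()(cnn)

lstm = layers.Bidirectional(layers.LSTM(32))(daily_in)

merged = layers.concatenate([cnn, lstm])
out = layers.Dense(1, activation="sigmoid")(merged)             # NTL probability

model = Model([weekly_in, daily_in], out)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(curve="PR", name="pr_auc")])
```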
Thesis
Full-text available
Smart Grid (SG) is a modernized grid that provides efficient, reliable and economic energy to the consumers. Energy is the most important resource in the world and almost everything relies on it. As smart devices are increasing dramatically with the rapid increase in population, there is a need for an efficient energy distribution mechanism. Furthermore, the forecasting of electricity consumption is supposed to be a major constituent to enhance the performance of the SG. Various learning algorithms have been proposed in the literature for efficient load and price forecasting. However, there exist some issues in the proposed work like increased computational complexity. The sole purpose of the work done in this thesis is to efficiently predict electricity load and price using different techniques with minimum computational complexity. Chapter 1 provides an introduction of various concepts present in the power grids. Afterwards, the unified system model, different sub-problems and the contributions made in the thesis are also presented. Chapter 2 discusses the existing work done by different researchers for performing electricity load and price forecasting. In Chapter 3, Enhanced Logistic Regression (ELR) and Enhanced Recurrent Extreme Learning Machine (ERELM) are proposed for performing short-term load and price forecasting. The former is an enhanced form of Logistic Regression (LR); whereas, the weights and biases of the latter are optimized using Grey Wolf Optimizer (GWO). Classification And Regression Tree (CART), Relief-F and Recursive Feature Elimination (RFE) are used for feature selection and extraction. On the basis of selected features, classification is performed using ELR. Moreover, cross validation is done using Monte Carlo and K-Fold methods. In order to ensure optimal and secure functionality of Micro Grid (MG), Chapter 4 focuses on coordinated energy management of traditional and Renewable Energy Sources (RES). Users and MG with storage capacity are taken into account to perform efficient energy management. A two stage Stackelberg game is formulated. Every player in the game tries to increase its payoff, and ensure user comfort and system reliability. Furthermore, two forecasting techniques are proposed in order to forecast Photo-Voltaic Cell (PVC) generation for announcing optimal prices. Both the existence and uniqueness of Nash Equilibrium (NE) for the energy management algorithm are also considered. In Chapter 5, a novel forecasting model, termed as ELS-net, is proposed. It is a combination of an Ensemble Empirical Mode Decomposition (EEMD) method, multi-model Ensemble Bi Long Short Term Memory (EBiLSTM) forecasting technique and Support Vector Machine (SVM). In the proposed model, EEMD is used to distinguish between linear and non-linear Intrinsic Mode Functions (IMFs). EBiLSTM is used to forecast the non-linear IMFs and SVM is employed to forecast the linear IMFs. The usage of separate forecasting techniques for linear and non-linear IMFs decreases the computational complexity of the model. In Chapter 6, a novel deep learning model, termed as Gated-FCN, is introduced for short-term load forecasting. The key idea is to introduce automated feature selection and a deep learning model for forecasting, which includes an eight layered FCN (FCN-8). It ensures that hand crafted feature selection is avoided as it requires expert domain knowledge. Furthermore, Gated-FCN also helps in reducing noise as it learns internal dependencies as well as the correlation of the time-series. 
Enhanced Bidirectional Gated Recurrent Unit (EBiGRU) model is dovetailed with FCN-8 in order to learn temporal long-term dependencies of the time-series. Furthermore, weight averaging mechanism of multiple snapshot models is adapted in order to take optimized weights of BiGRU. At the end of FCN-8 and BiGRU, a fully connected dense layer is used that gives final prediction results. The simulations are performed and the results are provided at the end of each chapter. In Chapter 3, the simulations are performed using UMass electric and UCI datasets. ELR shows better performance with the former dataset; whereas, ERELM has better accuracy with the latter. The proposed techniques are then compared with different benchmark schemes. The comparison is done to verify the adaptivity of the proposed techniques. The simulation results show that the proposed techniques outperform the benchmark schemes and increase the prediction accuracy of electricity load and price. Similarly, in Chapter 4, simulations are performed using Elia, Belgium dataset. The results clearly show that the proposed game theoretic approach along with storage capacity optimization and forecasting techniques give benefits to both users and MG. In Chapter 5, simulations are performed to examine the effectiveness of the proposed model using two different datasets: New South Wales (NSW) and Victoria (VIC). From the simulation results, it is obvious that the proposed ELS-net model outperforms the benchmark techniques: EMD-BILSTM-SVM, EMD-PSO-GA-SVR, BiLSTM, MLP and SVM in terms of forecasting accuracy and minimum execution time. Similarly, the simulation results of Chapter 6 depict that Gated-FCN gives maximum forecasting accuracy as compared to the benchmark techniques. For performance evaluation of the proposed work, different performance metrics are used: Mean Absolute Percentage Error (MAPE), Mean Absolute Error (MAE), Mean Squared Error (MSE) and Root Mean Square Error (RMSE). The overall results prove that the work done in this thesis outperforms the existing work in terms of electricity load and price forecasting, and computational complexity.
Chapter
In this research article, we tackle the following limitations: high misclassification rate, low detection rate, the class imbalance problem and the unavailability of malicious or theft samples. The class imbalance problem is a severe issue in electricity theft detection that affects the performance of supervised learning methods. We exploit the adaptive synthetic minority oversampling technique to tackle this problem. Moreover, theft samples are created from benign samples, arguing that the goal of theft is to report less than the actual electricity consumption. Different machine learning and deep learning methods, including the recently developed light and extreme gradient boosting (XGBoost), are trained and evaluated on a realistic electricity consumption dataset provided by an electric utility in Pakistan. The consumers in the dataset belong to different demographics and different social and financial backgrounds. A number of classifiers are trained on the acquired data; however, long short-term memory (LSTM) and XGBoost attain high performance and outperform all other classifiers. XGBoost achieves a 0.981 detection rate and a 0.015 misclassification rate, whereas LSTM attains a 0.976 detection rate and a 0.033 misclassification rate. Moreover, the performance of all implemented classifiers is evaluated through precision, recall, F1-score, etc.
Chapter
In this paper, a data-driven solution is proposed to detect Non-Technical Losses (NTLs) in smart grids. In the real world, the number of theft samples is smaller than the number of benign samples, which leads to a data imbalance issue. To resolve the issue, diverse theft attacks are applied to the benign samples to generate synthetic theft samples for data balancing and to mimic real-world theft patterns. Furthermore, several non-malicious factors influence users' energy usage patterns, such as consumers' behavior during weekends, seasonal change and family structure. These factors adversely affect the model's performance, resulting in misclassification. So, non-malicious factors along with smart meters' data need to be considered to enhance the theft detection accuracy. Keeping this in view, a hybrid Multi-Layer Perceptron and Gated Recurrent Unit (MLP-GRU) based Deep Neural Network (DNN) is proposed to detect electricity theft. The MLP model takes auxiliary data such as geographical information as input, while the dataset of smart meters is provided as input to the GRU model. Due to the improved generalization capability of the MLP with reduced overfitting and the effective gated configuration of the multi-layered GRU, the proposed model proves to be an ideal solution in terms of prediction accuracy and computational time. Furthermore, the proposed model is compared with the existing MLP-LSTM model through simulations. The results show that MLP-GRU achieves scores of 0.87 and 0.89 for the Area under the Receiver Operating Characteristic Curve (ROC-AUC) and the Area under the Precision-Recall Curve (PR-AUC), respectively, as compared to 0.72 and 0.47 for MLP-LSTM.
Article
Non-technical losses (NTLs) are one of the major causes of revenue losses for electric utilities. In the literature, various machine learning (ML)/deep learning (DL) approaches are employed to detect NTLs. The existing studies are mostly concerned with tuning the hyperparameters of ML/DL methods for efficient detection of NTL, i.e., electricity theft detection. Some of them focus on the selection of prominent features from data to improve the performance of electricity theft detection. However, the curse of dimensionality affects the generalization ability of ML/DL classifiers and leads to computational, storage, and overfitting problems. Therefore, to deal with the above-mentioned issues, this study proposes a system based on metaheuristic techniques (artificial bee colony and genetic algorithm) and denoising autoencoder for electricity theft detection using big data in electric power systems. The former (metaheuristics) are used to select prominent features, while the latter is utilized to extract high variance features from electricity consumption data. Firstly, 11 new features are synthesized using statistical and electrical parameters from the user’s consumption history. Then, the synthesized features are used as input to metaheuristic techniques to find a subset of optimal features. Finally, the optimal features are fed as input to the denoising autoencoder to extract features with high variance. The ability of both metaheuristic and autoencoder techniques to select and extract features is measured using a support vector machine. The proposed system reduces the overfitting, storage, and computational overhead of ML classifiers. Moreover, we perform several experiments to verify the effectiveness of our proposed system and results reveal that the proposed system has better performance than its counterparts.
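A hedged Keras sketch of the denoising-autoencoder feature-extraction stage mentioned above; the noise level, layer widths and the 11-feature input size are assumptions, and the metaheuristic feature-selection stage is not shown.

```python
# Illustrative denoising autoencoder: corrupt the input, learn to reconstruct it,
# then use the encoder output as high-variance features for a downstream SVM.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

n_features = 11                                   # e.g. the synthesized features
inputs = layers.Input(shape=(n_features,))
noisy = layers.GaussianNoise(0.1)(inputs)         # corruption is active during training only
encoded = layers.Dense(8, activation="relu")(noisy)
encoded = layers.Dense(4, activation="relu", name="code")(encoded)
decoded = layers.Dense(8, activation="relu")(encoded)
decoded = layers.Dense(n_features, activation="linear")(decoded)

dae = Model(inputs, decoded)
dae.compile(optimizer="adam", loss="mse")

X = np.random.default_rng(0).normal(size=(1000, n_features)).astype("float32")
dae.fit(X, X, epochs=10, batch_size=32, verbose=0)

encoder = Model(inputs, dae.get_layer("code").output)
features = encoder.predict(X, verbose=0)          # inputs to the downstream SVM
```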
Conference Paper
Full-text available
In this research article, we tackle the following limitations: high misclassification rate, low detection rate, the class imbalance problem and the unavailability of malicious or theft samples. The class imbalance problem is a severe issue in electricity theft detection that affects the performance of supervised learning methods. We exploit the adaptive synthetic minority oversampling technique to tackle this problem. Moreover, theft samples are created from benign samples, arguing that the goal of theft is to report less than the actual electricity consumption. Different machine learning and deep learning methods, including the recently developed light and extreme gradient boosting (XGBoost), are trained and evaluated on a realistic electricity consumption dataset provided by an electric utility in Pakistan. The consumers in the dataset belong to different demographics and different social and financial backgrounds. A number of classifiers are trained on the acquired data; however, long short-term memory (LSTM) and XGBoost attain high performance and outperform all other classifiers. XGBoost achieves a 0.981 detection rate and a 0.015 misclassification rate, whereas LSTM attains a 0.976 detection rate and a 0.033 misclassification rate. Moreover, the performance of all implemented classifiers is evaluated through precision, recall, F1-score, etc.
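An illustrative sketch of the oversampling-plus-boosting step described above, assuming the imbalanced-learn and xgboost packages; ADASYN stands in for the adaptive synthetic minority oversampling the abstract mentions, and the data and hyperparameters are not the paper's.

```python
# Hypothetical ADASYN + XGBoost pipeline for theft detection; parameters are illustrative.
from imblearn.over_sampling import ADASYN
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import recall_score
from xgboost import XGBClassifier

X, y = make_classification(n_samples=3000, n_features=50, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

X_bal, y_bal = ADASYN(random_state=0).fit_resample(X_tr, y_tr)   # balance only the training split

clf = XGBClassifier(n_estimators=300, max_depth=6, learning_rate=0.1, eval_metric="logloss")
clf.fit(X_bal, y_bal)

pred = clf.predict(X_te)
print("detection rate (recall on theft class):", recall_score(y_te, pred))
```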
Article
Full-text available
Enormous amounts of data are being produced everyday by sub-meters and smart sensors installed in residential buildings. If leveraged properly, that data could assist end-users, energy producers and utility companies in detecting anomalous power consumption and understanding the causes of each anomaly. Therefore, anomaly detection could stop a minor problem becoming overwhelming. Moreover, it will aid in better decision-making to reduce wasted energy and promote sustainable and energy efficient behavior. In this regard, this paper is an in-depth review of existing anomaly detection frameworks for building energy consumption based on artificial intelligence. Specifically, an extensive survey is presented, in which a comprehensive taxonomy is introduced to classify existing algorithms based on different modules and parameters adopted, such as machine learning algorithms, feature extraction approaches, anomaly detection levels, computing platforms and application scenarios. To the best of the authors’ knowledge, this is the first review article that discusses anomaly detection in building energy consumption. Moving forward, important findings along with domain-specific problems, difficulties and challenges that remain unresolved are thoroughly discussed, including the absence of: (i) precise definitions of anomalous power consumption, (ii) annotated datasets, (iii) unified metrics to assess the performance of existing solutions, (iv) platforms for reproducibility and (v) privacy-preservation. Following, insights about current research trends are discussed to widen the applications and effectiveness of the anomaly detection technology before deriving future directions attracting significant attention. This article serves as a comprehensive reference to understand the current technological progress in anomaly detection of energy consumption based on artificial intelligence.
Article
Full-text available
Electricity theft is one of the main causes of non-technical losses and its detection is important for power distribution companies to avoid revenue loss. The advancement of traditional grids to smart grids allows a two-way flow of information and energy that enables real-time energy management, billing and load surveillance. This infrastructure enables power distribution companies to automate electricity theft detection (ETD) by constructing new innovative data-driven solutions. Whereas, the traditional ETD approaches do not provide acceptable theft detection performance due to high-dimensional imbalanced data, loss of data relationships during feature extraction and the requirement of experts' involvement. Hence, this paper presents a new semi-supervised solution for ETD, which consists of relational denoising autoencoder (RDAE) and attention guided (AG) TripleGAN, named as RDAE-AG-TripleGAN. In this system, RDAE is implemented to derive features and their associations while AG performs feature weighting and dynamically supervises the AG-TripleGAN. As a result, this procedure significantly boosts the ETD. Furthermore, to demonstrate the acceptability of the proposed methodology over conventional approaches, we conducted extensive simulations using the real power consumption data of smart meters. The proposed solution is validated over the most useful and suitable performance indicators: area under the curve, precision, recall, Matthews correlation coefficient, F1-score and precision-recall area under the curve. The simulation results prove that the proposed method efficiently improves the detection of electricity frauds against conventional ETD schemes such as extreme gradient boosting machine and transductive support vector machine. The proposed solution achieves the detection rate of 0.956, which makes it more acceptable for electric utilities than the existing approaches.
Article
Full-text available
Nowadays, analyzing, detecting, and visualizing abnormal power consumption behavior of householders are among the principal challenges in identifying ways to reduce power consumption. This paper introduces a new solution to detect energy consumption anomalies based on extracting micro-moment features using a rule-based model. The latter is used to draw out load characteristics using daily intent-driven moments of user consumption actions. Besides micro-moment features extraction, we also experiment with a deep neural network architecture for efficient abnormality detection and classification. In the following, a novel anomaly visualization technique is introduced that is based on a scatter representation of the micro-moment classes, and hence providing consumers an easy solution to understand their abnormal behavior. Moreover, in order to validate the proposed system, a new energy consumption dataset at appliance level is also designed through a measurement campaign carried out at Qatar University Energy Lab, namely, Qatar University dataset. Experimental results on simulated and real datasets collected at two regions, which have extremely different climate conditions, confirm that the proposed deep micro-moment architecture outperforms other machine learning algorithms and can effectively detect anomalous patterns. For example, 99.58% accuracy and 97.85% F1 score have been achieved under Qatar University dataset. These promising results establish the efficacy of the proposed deep micro-moment solution for detecting abnormal energy consumption, promoting energy efficiency behaviors, and reducing wasted energy.
Article
Full-text available
Energy consumption is increasing exponentially with the growing number of electronic gadgets. Losses occur during generation, transmission, and distribution. The rising energy demand leads to an increase in electricity theft (ET) on the distribution side. Data analysis is the process of assessing data using different analytical and statistical tools to extract useful information. Fluctuations in energy consumption patterns can indicate electricity theft. Utilities bear losses of millions of dollars every year. Hardware-based solutions are considered the best; however, their deployment cost is high. Software-based solutions are data-driven and cost-effective, but they require big data together with artificial intelligence and machine learning techniques. Several solutions have been proposed in existing studies; however, low detection performance and a high false positive rate are the major issues. In this paper, we employ, for the first time, a bidirectional Gated Recurrent Unit for ET detection and classification using real time-series data. We also propose a new scheme that combines the Synthetic Minority Oversampling TEchnique (SMOTE) with the Tomek Link undersampling technique: the “Smote Over Sampling Tomik Link (SOSTLink) sampling technique”. Kernel Principal Component Analysis is used for feature extraction. To evaluate the proposed model’s performance, five performance metrics are used, including precision, recall, F1-score, Root Mean Square Error (RMSE), and the receiver operating characteristic curve. Experiments show that our proposed model outperforms the state-of-the-art techniques: logistic regression, decision tree, random forest, support vector machine, convolutional neural network, long short-term memory, and a hybrid of multilayer perceptron and convolutional neural network.
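A rough sketch of the sampling-plus-feature-extraction idea above: the SMOTETomek combination from imbalanced-learn stands in for the proposed SOSTLink technique, and kernel PCA extracts non-linear features. The bidirectional GRU classifier and the real dataset are omitted.

```python
# Illustrative SMOTE + Tomek-link resampling followed by kernel PCA feature extraction.
from imblearn.combine import SMOTETomek
from sklearn.datasets import make_classification
from sklearn.decomposition import KernelPCA

X, y = make_classification(n_samples=2000, n_features=60, weights=[0.92, 0.08], random_state=0)

X_bal, y_bal = SMOTETomek(random_state=0).fit_resample(X, y)            # combined over/under-sampling
X_feat = KernelPCA(n_components=20, kernel="rbf").fit_transform(X_bal)  # non-linear features
print(X_feat.shape)   # features fed to a downstream (e.g. bidirectional GRU) classifier
```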
Article
Full-text available
Unlike the existing research that focuses on detecting electricity theft cyber-attacks in the consumption domain, this paper investigates electricity thefts at the distributed generation (DG) domain. In this attack, malicious customers hack into the smart meters monitoring their renewable-based DG units and manipulate their readings to claim higher supplied energy to the grid and hence falsely overcharge the utility company. Deep machine learning is investigated to detect such a malicious behavior. We aim to answer three main questions in this paper: a) What are the cyber-attack functions that can be applied by malicious customers to the generation data in order to falsely overcharge the utility company? b) What sources of data can be used in order to detect these cyber-attacks by the utility company? c) Which deep machine learning-model should be used in order to detect these cyber-attacks? Our investigation revealed that integrating various data from the DG smart meters, meteorological reports, and SCADA metering points in the training of a deep convolutional-recurrent neural network offers the highest detection rate (99.3%) and lowest false alarm (0.22%).
Article
Full-text available
In the smart grid (SG) environment, consumers are enabled to alter electricity consumption patterns in response to electricity prices and incentives. This results in prices that may differ from the initial price pattern. Electricity price and demand forecasting play a vital role in the reliability and sustainability of the SG. Forecasting using big data has become a new hot research topic as a massive amount of data is being generated and stored in the SG environment. Electricity users with advance knowledge of electricity prices and demand can manage their load efficiently. In this paper, a recurrent neural network (RNN), long short term memory (LSTM), is used for electricity price and demand forecasting using big data. Researchers are working actively to propose new models of forecasting. These models contain a single input variable as well as multiple variables. From the literature, we observed that the use of multiple variables enhances the forecasting accuracy. Hence, our proposed model uses multiple variables as input and forecasts the future values of electricity demand and price. The hyperparameters of this algorithm are tuned using the Jaya optimization algorithm to improve the forecasting ability and enhance the training process of the model. Parameter tuning is necessary because the performance of a forecasting model depends on the values of these parameters. Selection of inappropriate values can result in inaccurate forecasting. So, integration of an optimization method improves the forecasting accuracy with minimum user effort. For efficient forecasting, data is preprocessed and cleaned of missing values and outliers using the z-score method. Furthermore, data is normalized before forecasting. The forecasting accuracy of the proposed model is evaluated using the root mean square error (RMSE) and mean absolute error (MAE). For a fair comparison, the proposed forecasting model is compared with univariate LSTM and support vector machine (SVM). The values of the performance metrics depict that the proposed model has higher accuracy than SVM and univariate LSTM.
Article
Full-text available
As one of the major factors of the nontechnical losses (NTLs) in distribution networks, electricity theft causes significant harm to power grids, which influences power supply quality and reduces operating profits. In order to help utility companies solve the problems of inefficient electricity inspection and irregular power consumption, a novel hybrid convolutional neural network-random forest (CNN-RF) model for automatic electricity theft detection is presented in this paper. In this model, a convolutional neural network (CNN) is first designed to learn the features between different hours of the day and different days from massive and varying smart meter data by the operations of convolution and downsampling. In addition, a dropout layer is added to reduce the risk of overfitting, and the backpropagation algorithm is applied to update network parameters in the training phase. Then, the random forest (RF) is trained based on the obtained features to detect whether the consumer steals electricity. To build the RF in the hybrid model, the grid search algorithm is adopted to determine optimal parameters. Finally, experiments are conducted based on real energy consumption data, and the results show that the proposed detection model outperforms other methods in terms of accuracy and efficiency.
Article
Full-text available
The increasing load demand in residential areas and irregular electricity load profiles encouraged us to propose an efficient Home Energy Management System (HEMS) for optimal scheduling of home appliances. We propose a multi-objective optimization based solution that shifts the electricity load from On-peak to Off-peak hours according to the defined objective load curve for electricity. It aims to manage the trade-off between conflicting objectives: electricity bill, waiting time of appliances and electricity load shifting according to the defined electricity load pattern. The defined electricity load pattern helps in balancing the load during On-peak and Off-peak hours. Moreover, for real time rescheduling, the concept of coordination among home appliances is presented. This helps the scheduler to optimally decide the ON/OFF status of appliances to reduce the waiting time of the appliance. Moreover, electricity consumers have a stochastic nature, for which nature-inspired optimization techniques provide optimal solutions. For optimal scheduling, we propose two optimization techniques: binary multi-objective bird swarm optimization and a hybrid of bird swarm and cuckoo search algorithms to obtain the Pareto front. Moreover, dynamic programming is used to enable coordination among the appliances so that real-time scheduling can be performed by the scheduler on the user's demand. To validate the performance of the proposed nature-based optimization techniques, we compare the results of the proposed schemes with existing techniques such as multi-objective binary particle swarm optimization and multi-objective cuckoo search algorithms. Simulation results validate the performance of the proposed techniques in terms of electricity cost reduction, peak to average ratio and waiting time minimization. Also, test functions for convex, non-convex and discontinuous Pareto fronts are implemented to prove the efficacy of the proposed techniques.
Article
Full-text available
Among an electricity provider's non-technical losses, electricity theft has the most severe and dangerous effects. Fraudulent electricity consumption decreases the supply quality, increases generation load, causes legitimate consumers to pay excessive electricity bills, and affects the overall economy. The adaptation of smart grids can significantly reduce this loss through data analysis techniques. The smart grid infrastructure generates a massive amount of data, including the power consumption of individual users. Utilizing this data, machine learning and deep learning techniques can accurately identify electricity theft users. In this paper, an electricity theft detection system is proposed based on a combination of a convolutional neural network (CNN) and a long short-term memory (LSTM) architecture. CNN is a widely used technique that automates feature extraction and the classification process. Since the power consumption signature is time-series data, we were led to build a CNN-based LSTM (CNN-LSTM) model for smart grid data classification. In this work, a novel data pre-processing algorithm was also implemented to compute the missing instances in the dataset, based on the local values relative to the missing data point. Furthermore, in this dataset, the count of electricity theft users was relatively low, which could have made the model inefficient at identifying theft users. This class imbalance scenario was addressed through synthetic data generation. Finally, the results obtained indicate the proposed scheme can classify both the majority class (normal users) and the minority class (electricity theft users) with good accuracy.
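The local-neighbour imputation idea can be illustrated as below; linear interpolation between the nearest non-missing readings (which, for a single gap, equals the mean of the two adjacent readings) is a common simple variant and not necessarily the paper's exact algorithm.

```python
# Illustrative local imputation: fill missing readings from their nearest
# non-missing neighbours; a generic variant, not the paper's exact rule.
import numpy as np
import pandas as pd

readings = pd.Series([1.2, np.nan, 1.4, 1.3, np.nan, np.nan, 1.1])
filled = readings.interpolate(method="linear", limit_direction="both")
print(filled.tolist())
```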
Article
Full-text available
This paper proposes a novel sparsity adaptive simulated annealing algorithm to solve the issue of sparse recovery. This algorithm combines the advantage of the sparsity adaptive matching pursuit (SAMP) algorithm and the simulated annealing method in global searching for the recovery of the sparse signal. First, we calculate the sparsity and the initial support collection as the initial search points of the proposed optimization algorithm by using the idea of SAMP. Then, we design a two-cycle reconstruction method to find the support sets efficiently and accurately by updating the optimization direction. Finally, we take advantage of the sparsity adaptive simulated annealing algorithm in global optimization to guide the sparse reconstruction. The proposed sparsity adaptive greedy pursuit model has a simple geometric structure, it can get the global optimal solution, and it is better than the greedy algorithm in terms of recovery quality. Our experimental results validate that the proposed algorithm outperforms existing state-of-the-art sparse reconstruction algorithms.
Article
Full-text available
The emergence of the smart grid has empowered consumers to manage home energy in an efficient and effective manner. In this regard, home energy management (HEM) is a challenging task that requires efficient scheduling of smart appliances to optimize energy consumption. In this paper, we propose a meta-heuristic based HEM system (HEMS) by incorporating the enhanced differential evolution (EDE) and harmony search algorithm (HSA). Moreover, to optimize the energy consumption, a hybridization based on HSA and EDE operators is performed. Further, multiple knapsacks are used to ensure that the load demand of electricity consumers does not exceed a threshold during peak hours. To achieve multiple objectives at the same time, hybridization proved to be effective in terms of electricity cost and peak-to-average ratio (PAR) reduction. The performance of the proposed technique, harmony EDE (HEDE), is evaluated via extensive simulations in MATLAB. The simulations are performed for a residential complex of multiple homes with a variety of smart appliances. The simulation results show that EDE performs better than HSA in terms of cost reduction, whereas HSA proves to be more efficient than EDE in terms of PAR. However, the proposed scheme outperforms both existing meta-heuristic techniques (HSA and EDE) in terms of cost and PAR.
Article
Full-text available
The two-way flow of information and energy is an important feature of the Energy Internet. Data analytics is a powerful tool in the information flow that aims to solve practical problems using data mining techniques. As the problem of electricity thefts via tampering with smart meters continues to increase, the abnormal behaviors of thefts become more diversified and more difficult to detect. Thus, a data analytics method for detecting various types of electricity thefts is required. However, the existing methods either require a labeled dataset or additional system information which is difficult to obtain in reality or have poor detection accuracy. In this paper, we combine two novel data mining techniques to solve the problem. One technique is the Maximum Information Coefficient (MIC), which can find the correlations between the non-technical loss (NTL) and a certain electricity behavior of the consumer. MIC can be used to precisely detect thefts that appear normal in shapes. The other technique is the clustering technique by fast search and find of density peaks (CFSFDP). CFSFDP finds the abnormal users among thousands of load profiles, making it quite suitable for detecting electricity thefts with arbitrary shapes. Next, a framework for combining the advantages of the two techniques is proposed. Numerical experiments on the Irish smart meter dataset are conducted to show the good performance of the combined method.
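The density-peaks idea underlying CFSFDP can be summarized in two quantities per load profile: a local density and the distance to the nearest denser point. The sketch below computes both with plain numpy; the cutoff choice and the anomaly rule are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch of the two CFSFDP quantities: rho_i (number of neighbors within a
# cutoff distance) and delta_i (distance to the nearest point of higher density).
# Low-density, well-separated profiles are candidate anomalous consumers.
import numpy as np
from scipy.spatial.distance import cdist

profiles = np.random.rand(500, 48)          # 500 load profiles, 48 half-hourly readings
D = cdist(profiles, profiles)               # pairwise distances
d_c = np.percentile(D[D > 0], 2)            # cutoff distance (a common ~2% heuristic)

rho = (D < d_c).sum(axis=1) - 1             # local density (exclude the point itself)
delta = np.empty(len(profiles))
for i in range(len(profiles)):
    higher = np.where(rho > rho[i])[0]
    delta[i] = D[i, higher].min() if len(higher) else D[i].max()

# Illustrative flagging rule: low density combined with large separation
suspicious = np.where((rho < np.percentile(rho, 5)) &
                      (delta > np.percentile(delta, 95)))[0]
print(len(suspicious), "candidate anomalous consumers")
```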
Article
Full-text available
Non-technical electricity losses due to anomalies or frauds are accountable for important revenue losses in power utilities. Recent advances have been made in this area, fostered by the roll-out of smart meters. In this paper, we propose a methodology for non-technical loss detection using supervised learning. The methodology has been developed and tested on real smart meter data of all the industrial and commercial customers of Endesa. This methodology uses all the information the smart meters record (energy consumption, alarms and electrical magnitudes) to obtain an in-depth analysis of the customer’s consumption behavior. It also uses auxiliary databases to provide additional information regarding the geographical location and technological characteristics of each smart meter. The model has been trained, validated and tested on the results of approximately 57000 on-field inspections. It is currently in use in a non-technical loss detection campaign for big customers. Several state-of-the-art classifiers have been tested. The results show that extreme gradient boosted trees outperform the rest of the classifiers.
Article
Full-text available
Electricity theft can be harmful to power grid suppliers and cause economic losses. By integrating information flows with energy flows, smart grids can help to solve the problem of electricity theft owing to the availability of the massive data they generate. Analysis of smart grid data is helpful in detecting electricity theft because of the abnormal consumption patterns of energy thieves. However, existing methods have poor electricity-theft detection accuracy since most of them operate on one-dimensional (1-D) electricity consumption data and fail to capture the periodicity of electricity consumption. In this paper, we propose a novel electricity-theft detection method based on a Wide & Deep Convolutional Neural Network (CNN) model to address the above concerns. In particular, the Wide & Deep CNN model consists of two components: the Wide component and the Deep CNN component. The Deep CNN component can accurately identify the non-periodicity of electricity theft and the periodicity of normal electricity usage based on two-dimensional (2-D) electricity consumption data. Meanwhile, the Wide component can capture the global features of 1-D electricity consumption data. As a result, the Wide & Deep CNN model achieves excellent performance in electricity-theft detection. Extensive experiments on a realistic dataset show that the Wide & Deep CNN model outperforms other existing methods.
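A hedged Keras functional-API sketch of this two-branch idea follows: the wide branch sees the raw 1-D consumption vector, while the deep CNN branch sees the same series reshaped into a weeks-by-days matrix. The 147-week shape and all layer sizes are assumptions, not the paper's exact setup.

```python
# Wide & Deep style sketch: 1-D wide branch + 2-D CNN branch over weeks x days.
import numpy as np
from tensorflow.keras import layers, models

n_days = 1029                      # 147 weeks * 7 days (assumed, for a clean reshape)
wide_in = layers.Input(shape=(n_days,), name="wide_1d")
deep_in = layers.Input(shape=(n_days // 7, 7, 1), name="deep_2d")

wide = layers.Dense(32, activation="relu")(wide_in)          # global 1-D features

x = layers.Conv2D(16, (3, 3), activation="relu", padding="same")(deep_in)
x = layers.MaxPooling2D((2, 2))(x)
x = layers.Conv2D(32, (3, 3), activation="relu", padding="same")(x)
x = layers.GlobalAveragePooling2D()(x)                        # periodicity-aware 2-D features

out = layers.Dense(1, activation="sigmoid")(layers.concatenate([wide, x]))
model = models.Model([wide_in, deep_in], out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

X = np.random.rand(32, n_days)                                # synthetic consumption data
model.fit([X, X.reshape(-1, n_days // 7, 7, 1)],
          np.random.randint(0, 2, 32), epochs=1, verbose=0)
```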
Conference Paper
Full-text available
Non-technical losses (NTL) in electricity distribution are caused by different reasons, such as poor equipment maintenance, broken meters or electricity theft. NTL occurs especially but not exclusively in emerging countries. Developed countries, even though usually in smaller amounts, have to deal with NTL issues as well. In these countries the estimated annual losses are up to six billion USD. These facts have directed the focus of our work to NTL detection. Our approach is composed of two steps: 1) We compute several features and combine them in sets characterized by four criteria: temporal, locality, similarity and infrastructure. 2) We then use the sets of features to train three machine learning classifiers: random forest, logistic regression and support vector machine. Our hypothesis is that features derived only from provider-independent data are adequate for an accurate detection of non-technical losses. We used the Area Under the Receiver-operating Curve (AUC) to assess the results.
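The evaluation setup described above maps directly onto cross-validated AUC in scikit-learn; the sketch below compares the same three classifier families on synthetic, imbalanced data (feature construction is omitted and all settings are illustrative).

```python
# Sketch: comparing random forest, logistic regression and SVM by
# cross-validated AUC on an imbalanced, NTL-like synthetic dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.95, 0.05], random_state=0)
models = {
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "logistic regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
}
for name, clf in models.items():
    auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: AUC = {auc.mean():.3f} +/- {auc.std():.3f}")
```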
Article
Full-text available
Imbalanced-learn is an open-source python toolbox aiming at providing a wide range of methods to cope with the problem of imbalanced datasets frequently encountered in machine learning and pattern recognition. The implemented state-of-the-art methods can be categorized into 4 groups: (i) under-sampling, (ii) over-sampling, (iii) combination of over- and under-sampling, and (iv) ensemble learning methods. The proposed toolbox only depends on numpy, scipy, and scikit-learn and is distributed under the MIT license. Furthermore, it is fully compatible with scikit-learn and is part of the scikit-learn-contrib supported project. Documentation, unit tests as well as integration tests are provided to ease usage and contribution. The toolbox is publicly available in GitHub: https://github.com/scikit-learn-contrib/imbalanced-learn.
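A minimal usage sketch of the toolbox follows, touching one sampler from each of the first three groups; the `fit_resample` call reflects recent releases (older versions used `fit_sample`).

```python
# Minimal imbalanced-learn usage sketch: over-sampling, under-sampling and a
# combined strategy, each applied to the same imbalanced dataset.
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE          # over-sampling
from imblearn.under_sampling import NearMiss      # under-sampling
from imblearn.combine import SMOTEENN             # combined over- and under-sampling

X, y = make_classification(n_samples=5000, weights=[0.91, 0.09], random_state=0)
print("original:", Counter(y))

for sampler in (SMOTE(random_state=0), NearMiss(), SMOTEENN(random_state=0)):
    X_res, y_res = sampler.fit_resample(X, y)
    print(type(sampler).__name__, Counter(y_res))
```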
Conference Paper
Full-text available
Non-technical losses (NTL) such as electricity theft cause significant harm to our economies, as in some countries they may range up to 40% of the total electricity distributed. Detecting NTLs requires costly on-site inspections. Accurate prediction of NTLs for customers using machine learning is therefore crucial. To date, related research largely ignores that the two classes of regular and non-regular customers are highly imbalanced and that NTL proportions may change, and it mostly considers small data sets, often preventing deployment of the results in production. In this paper, we present a comprehensive approach to assess three NTL detection models for different NTL proportions in large real-world data sets of hundreds of thousands of customers: Boolean rules, fuzzy logic and Support Vector Machine. This work has produced appreciable results that are about to be deployed in a leading industry solution. We believe that the considerations and observations made in this contribution are necessary for future smart meter research in order to report their effectiveness on imbalanced and large real-world data sets.
Article
Full-text available
Despite more than two decades of continuous development, learning from imbalanced data is still a focus of intense research. Starting as a problem of skewed distributions of binary tasks, this topic evolved way beyond this conception. With the expansion of machine learning and data mining, combined with the arrival of the big data era, we have gained a deeper insight into the nature of imbalanced learning, while at the same time facing new emerging challenges. Data-level and algorithm-level methods are constantly being improved and hybrid approaches gain increasing popularity. Recent trends focus on analyzing not only the disproportion between classes, but also other difficulties embedded in the nature of data. New real-life problems motivate researchers to focus on computationally efficient, adaptive and real-time methods. This paper aims at discussing open issues and challenges that need to be addressed to further develop the field of imbalanced learning. Seven vital areas of research in this topic are identified, covering the full spectrum of learning from imbalanced data: classification, regression, clustering, data streams, big data analytics and applications, e.g., in social media and computer vision. This paper provides a discussion and suggestions concerning lines of future research for each of them.
Article
Full-text available
Receiver Operating Characteristics (ROC) graphs are useful for organizing classifiers and visualizing their performance. ROC graphs are commonly used in medical decision making, and in recent years have been used increasingly in machine learning and data mining research. Although ROC graphs are apparently simple, there are some common misconceptions and pitfalls when using them in practice. The purpose of this article is to serve as an introduction to ROC graphs and as a guide for using them in research.
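In practice, a ROC graph and its AUC are usually built from classifier scores with scikit-learn, as in this short sketch (the classifier and data are placeholders).

```python
# Sketch: plotting a ROC curve and its AUC from classifier scores.
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

scores = RandomForestClassifier(random_state=0).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
fpr, tpr, _ = roc_curve(y_te, scores)
plt.plot(fpr, tpr, label=f"AUC = {auc(fpr, tpr):.3f}")
plt.plot([0, 1], [0, 1], "k--", label="no-skill diagonal (AUC = 0.5)")
plt.xlabel("False positive rate"); plt.ylabel("True positive rate"); plt.legend()
plt.show()
```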
Article
Full-text available
Classification of data with imbalanced class distribution has encountered a significant drawback of the performance attainable by most standard classifier learning algorithms which assume a relatively balanced class distribution and equal misclassification costs. This paper provides a review of the classification of imbalanced data regarding: the application domains; the nature of the problem; the learning difficulties with standard classifier learning algorithms; the learning objectives and evaluation measures; the reported research solutions; and the class imbalance problem in the presence of multiple classes.
Article
Full-text available
In many engineering optimization problems, the number of function evaluations is severely limited by time or cost. These problems pose a special challenge to the field of global optimization, since existing methods often require more function evaluations than can be comfortably afforded. One way to address this challenge is to fit response surfaces to data collected by evaluating the objective and constraint functions at a few points. These surfaces can then be used for visualization, tradeoff analysis, and optimization. In this paper, we introduce the reader to a response surface methodology that is especially good at modeling the nonlinear, multimodal functions that often occur in engineering. We then show how these approximating functions can be used to construct an efficient global optimization algorithm with a credible stopping rule. The key to using response surfaces for global optimization lies in balancing the need to exploit the approximating surface (by sampling where it is minimized) with the need to improve the approximation (by sampling where prediction error may be high). Striking this balance requires solving certain auxiliary problems which have previously been considered intractable, but we show how these computational obstacles can be overcome.
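The balance described above between exploiting the surrogate's minimum and sampling where prediction error is high is commonly expressed through an acquisition function such as expected improvement. The sketch below is a generic formulation for minimization, assuming the surrogate's posterior mean and standard deviation are given; it is not the paper's exact criterion.

```python
# Expected Improvement (EI) for minimization, given a surrogate's posterior
# mean mu(x), standard deviation sigma(x), and the best observed value f_min.
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_min, xi=0.01):
    """mu, sigma: arrays of surrogate predictions; xi: small exploration margin."""
    sigma = np.maximum(sigma, 1e-12)           # avoid division by zero
    z = (f_min - mu - xi) / sigma
    return (f_min - mu - xi) * norm.cdf(z) + sigma * norm.pdf(z)

# Candidate points with predicted means/uncertainties (illustrative numbers):
mu = np.array([0.30, 0.25, 0.40])
sigma = np.array([0.01, 0.20, 0.05])
print(expected_improvement(mu, sigma, f_min=0.28))   # sample next where EI is largest
```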
Article
Full-text available
Many different machine learning algorithms exist; taking into account each algorithm's hyperparameters, there is a staggeringly large number of possible alternatives overall. We consider the problem of simultaneously selecting a learning algorithm and setting its hyperparameters, going beyond previous work that addresses these issues in isolation. We show that this problem can be addressed by a fully automated approach, leveraging recent innovations in Bayesian optimization. Specifically, we consider a wide range of feature selection techniques (combining 3 search and 8 evaluator methods) and all classification approaches implemented in WEKA, spanning 2 ensemble methods, 10 meta-methods, 27 base classifiers, and hyperparameter settings for each classifier. On each of 21 popular datasets from the UCI repository, the KDD Cup 09, variants of the MNIST dataset and CIFAR-10, we show classification performance often much better than using standard selection/hyperparameter optimization methods. We hope that our approach will help non-expert users to more effectively identify machine learning algorithms and hyperparameter settings appropriate to their applications, and hence to achieve improved performance.
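The combined algorithm selection and hyperparameter optimization problem can be illustrated, in much simplified form, by searching over several (classifier, hyperparameter space) pairs and keeping the best cross-validated model. Auto-WEKA does this with Bayesian optimization over WEKA components; the Python sketch below only uses plain randomized search as an analogy.

```python
# Simplified analogue of combined algorithm selection + hyperparameter tuning:
# randomized search over two candidate classifiers and their search spaces.
from scipy.stats import randint, uniform
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import RandomizedSearchCV

X, y = load_digits(return_X_y=True)
candidates = [
    (RandomForestClassifier(random_state=0),
     {"n_estimators": randint(50, 400), "max_depth": randint(3, 20)}),
    (SVC(),
     {"C": uniform(0.1, 100), "gamma": uniform(1e-4, 0.1)}),
]

best = max(
    (RandomizedSearchCV(est, space, n_iter=20, cv=3, random_state=0).fit(X, y)
     for est, space in candidates),
    key=lambda search: search.best_score_,
)
print(type(best.best_estimator_).__name__, best.best_score_)
```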
Article
Full-text available
Classifier learning with data-sets that suffer from imbalanced class distributions is a challenging problem in the data mining community. This issue occurs when the number of examples that represent one class is much lower than that of the other classes. Its presence in many real-world applications has brought along a growth of attention from researchers. In machine learning, ensembles of classifiers are known to increase the accuracy of single classifiers by combining several of them, but neither of these learning techniques alone solves the class imbalance problem; to deal with this issue, ensemble learning algorithms have to be designed specifically. In this paper, our aim is to review the state of the art on ensemble techniques in the framework of imbalanced data-sets, with focus on two-class problems. We propose a taxonomy for ensemble-based methods to address the class imbalance, where each proposal can be categorized depending on the inner ensemble methodology on which it is based. In addition, we develop a thorough empirical comparison by considering the most significant published approaches within the families of the proposed taxonomy, to show whether any of them makes a difference. This comparison has shown the good behavior of the simplest approaches, which combine random undersampling techniques with bagging or boosting ensembles. In addition, the positive synergy between sampling techniques and bagging has stood out. Furthermore, our results show empirically that ensemble-based algorithms are worthwhile since they outperform the mere use of preprocessing techniques before learning the classifier, therefore justifying the increase of complexity by means of a significant enhancement of the results.
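The survey's conclusion, that random undersampling combined with bagging or boosting works well, corresponds to ready-made estimators in imbalanced-learn; a hedged usage sketch (API of recent versions) follows.

```python
# Sketch: ensembles that embed random undersampling, using imbalanced-learn.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from imblearn.ensemble import BalancedBaggingClassifier, RUSBoostClassifier

X, y = make_classification(n_samples=5000, weights=[0.93, 0.07], random_state=0)
for clf in (BalancedBaggingClassifier(random_state=0),   # undersampling + bagging
            RUSBoostClassifier(random_state=0)):         # undersampling + boosting
    auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()
    print(type(clf).__name__, round(auc, 3))
```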
Article
Reference architectures for big data and machine learning include not only interconnected building blocks but also important considerations (among others) for scalability, manageability and usability. Leveraging such reference architectures, the automated deployment of distributed toolsets and frameworks on various clouds is still challenging due to the diversity of technologies and protocols. The paper focuses particularly on the widespread Apache Spark cluster with Jupyter as the addressed framework, and on the Occopus cloud‐agnostic orchestrator tool for automating its deployment and maintenance stages. The presented approach has been demonstrated and validated with a new, promising text classification application on the Hungarian academic research infrastructure, the OpenStack‐based MTA Cloud. The paper explains the concept and the applied components, and illustrates their usage with real use‐case measurements.
Article
The theft of electricity affects power supply quality and the safety of grid operation, and non-technical losses (NTL) have become a major cause of unfair power supply and economic losses for power companies. For more effective electricity theft inspection, this paper proposes an electricity theft detection method based on similarity measures and a decision tree combined with K-Nearest Neighbor and support vector machine (DT-KSVM). Firstly, a condensed feature set is devised based on a feature selection strategy, and typical power consumption characteristic curves of users are obtained with the kernel fuzzy C-means algorithm (KFCM). Next, to address the lack of theft data and make reasonable use of the advanced metering infrastructure (AMI), a one-dimensional Wasserstein generative adversarial network (1D-WGAN) is used to generate additional simulated theft data. Then the numerical and morphological features in the similarity measurement process are comprehensively considered to conduct a preliminary detection of NTL, and DT-KSVM is used to perform secondary detection and identify suspicious customers. Finally, simulation experiments verify the effectiveness of the proposed method.
Article
This article proposes a random‐forest based A2Cloud framework to match scientific applications with Cloud providers and their instances for high performance. The framework leverages four engines for this task: PERF engine, Cloud trace engine, A2Cloud‐ext engine, and the random forest classifier (RFC) engine. The PERF engine profiles the application to obtain performance characteristics, including the number of single‐precision (SP) floating‐point operations (FLOPs), double‐precision (DP) FLOPs, x87 operations, memory accesses, and disk accesses. The Cloud trace engine obtains the corresponding performance characteristics of the selected Cloud instances including: SP floating point operations per second (FLOPS), DP FLOPS, x87 operations per second, memory bandwidth, and disk bandwidth. The A2Cloud‐ext engine uses the application and Cloud instance characteristics to generate objective scores that represent the application‐to‐Cloud match. The RFC engine uses these objective scores to generate two types of random forests to assist users with rapid analysis: application‐specific random forests (ARF) and application‐class based random forests. The ARF consider only the input application's characteristics to generate a random forest and provide numerical ratings to the selected Cloud instances. To generate the application‐class based random forests, the RFC engine downloads the application profiles and scores of previously tested applications that perform similar to the input application. Using these data, the RFC engine creates a random forest for instance recommendation. We exhaustively test this framework using eight real‐world applications across 12 instances from different Cloud providers. Our tests show significant statistical agreement between the instance ratings given by the framework and the ratings obtained via actual Cloud executions.
Article
Non-technical losses in electricity utilities are responsible for major revenue losses. In this paper, we propose a novel end-to-end solution to self-learn the features for detecting anomalies and frauds in smart meters using a hybrid deep neural network. The network is fed with simple raw data, removing the need of handcrafted feature engineering. The proposed architecture consists of a long short-term memory network and a multi-layer perceptrons network. The first network analyses the raw daily energy consumption history whilst the second one integrates non-sequential data such as its contracted power or geographical information. The results show that the hybrid neural network significantly outperforms state-of-the-art classifiers as well as previous deep learning models used in non-technical losses detection. The model has been trained and tested with real smart meter data of Endesa, the largest electricity utility in Spain.
Article
Despite many potential advantages, Advanced Metering Infrastructures have introduced new ways to falsify meter readings and commit electricity theft. This study contributes a new model-agnostic, feature-engineering framework for theft detection in smart grids. The framework introduces a combination of Finite Mixture Model clustering for customer segmentation and a Genetic Programming algorithm for identifying new features suitable for prediction. Utilizing demand data from more than 4000 households, a Gradient Boosting Machine algorithm is applied within the framework, significantly outperforming the results of prior machine-learning, theft-detection methods. This study further examines some important practical aspects of deploying theft detection including: the detection delay; the required size of historical demand data; the accuracy in detecting thefts of various types and intensity; detecting irregular and unseen attacks; and the computational complexity of the detection algorithm.
Article
For smart grid energy theft identification, this letter introduces a gradient boosting theft detector (GBTD) based on the three latest gradient boosting classifiers (GBCs): extreme gradient boosting (XGBoost), categorical boosting (CatBoost), and the light gradient boosting method (LightGBM). While most existing ML algorithms focus only on fine-tuning the hyperparameters of the classifiers, our ML algorithm, GBTD, focuses on feature engineering-based preprocessing to improve detection performance as well as time complexity. GBTD improves both the detection rate (DR) and false positive rate (FPR) of those GBCs by generating stochastic features such as the standard deviation, mean, minimum, and maximum value of daily electricity usage. GBTD also reduces classifier complexity with weighted feature-importance (WFI) based extraction techniques. Emphasis has been laid upon the practical application of the proposed ML for theft detection by minimizing the FPR, reducing data storage space, and improving the time complexity of the GBTD classifiers. Additionally, this letter proposes an updated version of the existing six theft cases to mimic real-world theft patterns and applies them to the dataset for numerical evaluation of the proposed algorithm.
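A hedged sketch of this style of preprocessing follows: summary statistics (mean, std, min, max) of daily usage are appended to each consumer's record before training a gradient boosting classifier. LightGBM and all settings here are illustrative, not the letter's exact configuration.

```python
# GBTD-style sketch: append stochastic summary features to daily usage and
# train a gradient boosting classifier.
import numpy as np
import lightgbm as lgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

daily = np.random.rand(3000, 365)                 # one year of daily kWh per consumer
y = np.random.randint(0, 2, 3000)                 # 1 = theft (placeholder labels)

stats = np.column_stack([daily.mean(axis=1), daily.std(axis=1),
                         daily.min(axis=1), daily.max(axis=1)])
X = np.hstack([daily, stats])                     # raw usage + stochastic features

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
model = lgb.LGBMClassifier(n_estimators=300, learning_rate=0.05, random_state=0)
model.fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```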
Article
The illegal use of electricity, defective meters, and a malfunctioning infrastructure are major causes of Non-Technical Losses (NTLs) in electric distribution systems. Although the use of supervised machine learning techniques to detect NTLs has been widely studied, further research is needed in order to address some significant challenges. (i) Given that fraudulent consumers remarkably outnumber non-fraudulent ones, the imbalanced nature of the dataset can have a major negative impact on the performance of supervised machine learning methods. (ii) Given the large number of dimensions present in the time series data used for training and testing classifiers, advanced signal processing techniques are required in order to extract the most relevant information. (iii) The effectiveness of classifiers must be evaluated using meaningful performance measures for imbalanced data. This paper proposes a framework that addresses the three previous challenges. The core of the proposed framework is the application of the Maximal Overlap Discrete Wavelet-Packet Transform (MODWPT) for feature extraction from time series data and the Random Undersampling Boosting (RUSBoost) algorithm for NTL detection. Moreover, our framework is evaluated using an extensive list of performance metrics. Experiments show that the MODWPT combined with the RUSBoost algorithm can significantly improve the quality of NTL predictions.
Article
Transactional memory (TM) is a programming paradigm that facilitates parallel programming for multi-core processors. In the last few years, some chip manufacturers provided hardware support for TM to reduce runtime overhead of Software Transactional Memory (STM). In this work, we offer two optimization techniques for TMs. The first technique focuses on Restricted Transactional Memory (RTM) in Intel's Haswell processor and shows that while in some applications, RTM improves performance over STM, in some others, it falls behind STM. We exploit this variability and propose an adaptive technique that switches between RTM and STM, statically. The second technique focuses on the overhead of TM and enhances the speed of the adaptive system. In particular, we focus on the size of transactions and improve performance by changing the transaction size. Optimizing the transaction size manually is a time-consuming process and requires significant software engineering effort. We use a combination of Linear Regression (LR) and decision tree to decide on the transaction size, automatically. We evaluate our optimization techniques using a set of benchmarks from NAS, DiscoPoP, and STAMP benchmark suites. Our experimental results reveal that our optimization techniques are able to improve the performance of TM programs by 9% and energy-delay by 15%, on average.
Conference Paper
Fraud detection in electricity consumption is a major challenge for power distribution companies. While many pattern recognition techniques have been applied to identify electricity theft, they often require extensive handcrafted feature engineering. Instead, through deep layers of transformation, nonlinearity, and abstraction, Deep Learning (DL) automatically extracts key features from data. In this paper, we design spatial and temporal deep learning solutions to identify nontechnical power losses (NTL), including Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM) and Stacked Autoencoder. These models are evaluated in a modified IEEE 123-bus test feeder. For the same tests, we also conduct comparison experiments using three conventional machine learning approaches: Random Forest, Decision Trees and shallow Neural Networks. Experimental results demonstrate that the spatiotemporal deep learning approaches outperform conventional machine learning approaches.
Conference Paper
Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.
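For reference, a minimal XGBoost usage sketch with its native API is shown below; the parameter values are illustrative, and `scale_pos_weight` is included only as a common way to handle class imbalance.

```python
# Minimal XGBoost sketch (native API) on an imbalanced binary task.
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

dtrain, dtest = xgb.DMatrix(X_tr, label=y_tr), xgb.DMatrix(X_te, label=y_te)
params = {"objective": "binary:logistic", "eval_metric": "auc",
          "max_depth": 6, "eta": 0.1,
          "scale_pos_weight": (y_tr == 0).sum() / (y_tr == 1).sum()}  # class imbalance
booster = xgb.train(params, dtrain, num_boost_round=200,
                    evals=[(dtest, "test")], verbose_eval=False)
print(booster.eval(dtest))   # reports the held-out AUC
```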
Article
Nontechnical losses, particularly due to electrical theft, have been a major concern in power system industries for a long time. Large-scale fraudulent consumption of electricity may widen the demand-supply gap to a great extent. Thus, there arises the need to develop a scheme that can detect these thefts precisely in complex power networks. Keeping focus on these points, this paper proposes a comprehensive top-down scheme based on a decision tree (DT) and support vector machine (SVM). Unlike existing schemes, the proposed scheme is capable of precisely detecting and locating real-time electricity theft at every level in power transmission and distribution (T&D). The proposed scheme is based on the combination of DT and SVM classifiers for rigorous analysis of gathered electricity consumption data. In other words, the proposed scheme can be viewed as a two-level data processing and analysis approach, since the data processed by the DT are fed as an input to the SVM classifier. Furthermore, the obtained results indicate that the proposed scheme reduces false positives to a great extent and is practical enough to be implemented in real-time scenarios.
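The two-level idea can be sketched as follows: a decision tree processes the consumption features first, and its output is appended to the original features before an SVM makes the final decision. The exact coupling used in the paper may differ; this is only an illustration of the cascade.

```python
# Sketch of a DT -> SVM cascade: the tree's predicted theft probability is
# added as an extra feature for the SVM.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.metrics import classification_report

X, y = make_classification(n_samples=3000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

dt = DecisionTreeClassifier(max_depth=6, random_state=0).fit(X_tr, y_tr)
X_tr2 = np.hstack([X_tr, dt.predict_proba(X_tr)[:, [1]]])   # DT output as extra feature
X_te2 = np.hstack([X_te, dt.predict_proba(X_te)[:, [1]]])

svm = SVC(kernel="rbf", class_weight="balanced").fit(X_tr2, y_tr)
print(classification_report(y_te, svm.predict(X_te2)))
```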
Article
Model selection and hyperparameter optimization is crucial in applying machine learning to a novel dataset. Recently, a subcommunity of machine learning has focused on solving this problem with Sequential Model-based Bayesian Optimization (SMBO), demonstrating substantial successes in many applications. However, for computationally expensive algorithms the overhead of hyperparameter optimization can still be prohibitive. In this paper we mimic a strategy human domain experts use: speed up optimization by starting from promising configurations that performed well on similar datasets. The resulting initialization technique integrates naturally into the generic SMBO framework and can be trivially applied to any SMBO method. To validate our approach, we perform extensive experiments with two established SMBO frameworks (Spearmint and SMAC) with complementary strengths; optimizing two machine learning frameworks on 57 datasets. Our initialization procedure yields mild improvements for low-dimensional hyperparameter optimization and substantially improves the state of the art for the more complex combined algorithm selection and hyperparameter optimization problem.
Article
Hyperparameter learning has traditionally been a manual task because of the limited number of trials. Today's computing infrastructures allow bigger evaluation budgets, thus opening the way for algorithmic approaches. Recently, surrogate-based optimization was successfully applied to hyperparameter learning for deep belief networks and to WEKA classifiers. The methods combined brute force computational power with model building about the behavior of the error function in the hyperparameter space, and they could significantly improve on manual hyperparameter tuning. What may make experienced practitioners even better at hyperparameter optimization is their ability to generalize across similar learning problems. In this paper, we propose a generic method to incorporate knowledge from previous experiments when simultaneously tuning a learning algorithm on new problems at hand. To this end, we combine surrogate-based ranking and optimization techniques for surrogate-based collaborative tuning (SCoT). We demonstrate SCoT in two experiments where it outperforms standard tuning techniques and single-problem surrogate-based optimization.
Article
As one of the key components of the smart grid, advanced metering infrastructure brings many potential advantages such as load management and demand response. However, computerizing the metering system also introduces numerous new vectors for energy theft. In this paper, we present a novel consumption pattern-based energy theft detector, which leverages the predictability property of customers' normal and malicious consumption patterns. Using distribution transformer meters, areas with a high probability of energy theft are short listed, and by monitoring abnormalities in consumption patterns, suspicious customers are identified. Application of appropriate classification and clustering techniques, as well as concurrent use of transformer meters and anomaly detectors, make the algorithm robust against nonmalicious changes in usage pattern, and provide a high and adjustable performance with a low-sampling rate. Therefore, the proposed method does not invade customers' privacy. Extensive experiments on a real dataset of 5000 customers show a high performance for the proposed method.
Article
The Maximum Likelihood (ML) and Cross Validation (CV) methods for estimating covariance hyper-parameters are compared, in the context of Kriging with a misspecified covariance structure. A two-step approach is used. First, the case of the estimation of a single variance hyper-parameter is addressed, for which the fixed correlation function is misspecified. A predictive variance based quality criterion is introduced and a closed-form expression of this criterion is derived. It is shown that when the correlation function is misspecified, the CV does better compared to ML, while ML is optimal when the model is well-specified. In the second step, the results of the first step are extended to the case when the hyper-parameters of the correlation function are also estimated from data.
Machine learning algorithms frequently require careful tuning of model hyperparameters, regularization terms, and optimization parameters. Unfortunately, this tuning is often a "black art" that requires expert experience, unwritten rules of thumb, or sometimes brute-force search. Much more appealing is the idea of developing automatic approaches which can optimize the performance of a given learning algorithm to the task at hand. In this work, we consider the automatic tuning problem within the framework of Bayesian optimization, in which a learning algorithm's generalization performance is modeled as a sample from a Gaussian process (GP). The tractable posterior distribution induced by the GP leads to efficient use of the information gathered by previous experiments, enabling optimal choices about what parameters to try next. Here we show how the effects of the Gaussian process prior and the associated inference procedure can have a large impact on the success or failure of Bayesian optimization. We show that thoughtful choices can lead to results that exceed expert-level performance in tuning machine learning algorithms. We also describe new algorithms that take into account the variable cost (duration) of learning experiments and that can leverage the presence of multiple cores for parallel experimentation. We show that these proposed algorithms improve on previous automatic procedures and can reach or surpass human expert-level optimization on a diverse set of contemporary algorithms including latent Dirichlet allocation, structured SVMs and convolutional neural networks.
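One widely used realization of GP-based Bayesian hyperparameter optimization is scikit-optimize's gp_minimize; the sketch below tunes an SVM's C and gamma against cross-validated error as an illustration, and is not the toolkit or setup used in the paper.

```python
# Sketch: GP-based Bayesian optimization of SVM hyperparameters with skopt.
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from skopt import gp_minimize
from skopt.space import Real

X, y = load_digits(return_X_y=True)

def objective(params):
    C, gamma = params
    score = cross_val_score(SVC(C=C, gamma=gamma), X, y, cv=3).mean()
    return 1.0 - score                       # gp_minimize minimizes

result = gp_minimize(objective,
                     [Real(1e-2, 1e3, prior="log-uniform"),    # C
                      Real(1e-5, 1e0, prior="log-uniform")],   # gamma
                     n_calls=25, random_state=0)
print("best (C, gamma):", result.x, "cv error:", result.fun)
```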
Article
In this paper we investigate the use of the area under the receiver operating characteristic (ROC) curve (AUC) as a performance measure for machine learning algorithms. As a case study we evaluate six machine learning algorithms (C4.5, Multiscale Classifier, Perceptron, Multi-layer Perceptron, k-Nearest Neighbours, and a Quadratic Discriminant Function) on six "real world" medical diagnostics data sets. We compare the use of AUC to the more conventional overall accuracy and find that AUC exhibits a number of desirable properties: increased sensitivity in Analysis of Variance (ANOVA) tests; a standard error that decreases as both AUC and the number of test samples increase; independence from the decision threshold; and invariance to a priori class probabilities. The paper concludes with the recommendation that AUC be used in preference to overall accuracy for "single number" evaluation of machine learning algorithms.