Comparative Study of Data Driven Approaches
towards Efficient Electricity Theft Detection in
Micro Grids
Faisal Shehzad1, Muhammad Asif1, Zeeshan Aslam2,
Shahzaib Anwar3, Hamza Rashid4, Muhammad Ilyas3, Nadeem Javaid1,
1Department of Computer Science, COMSATS University Islamabad, Islamabad, Pakistan
2Department of Computer Science, Bahria University Islamabad, Islamabad, Pakistan
3Department of Electrical & Computer Engineering, COMSATS University Islamabad, Islamabad, Pakistan
4University of Management & Technology, Lahore, Pakistan
Email: faisalaslam156@gmail.com, muhammad.asif.comsat@gmail.com, zeeshanxh@gmail.com,
shazaibanwar1@gmail.com, hamzarashid84@gmail.com, milyas3731@gmail.com,
Corresponding author: nadeemjavaidqau@gmail.com; www.njavaid.com.
Abstract—In this research article, we tackle the following limitations: high misclassification rate, low detection rate, the class imbalance problem and the unavailability of malicious (theft) samples. Class imbalance is a severe issue in electricity theft detection that degrades the performance of supervised learning methods. We exploit the adaptive synthetic minority oversampling technique to tackle this problem. Moreover, theft samples are created from benign samples, and we argue that the goal of theft is to report less than the actual electricity consumption. Different machine learning and deep learning methods, including the recently developed light and extreme gradient boosting (XGBoost) classifiers, are trained and evaluated on a realistic electricity consumption dataset that is provided by an electric utility in Pakistan. The consumers in the dataset belong to different demographics and different social and financial backgrounds. Several classifiers are trained on the acquired data; long short-term memory (LSTM) and XGBoost attain high performance and outperform the other classifiers. XGBoost achieves a 0.981 detection rate and a 0.015 misclassification rate, whereas LSTM attains a 0.976 detection rate and a 0.033 misclassification rate. Moreover, the performance of all implemented classifiers is evaluated through precision, recall, F1-score, etc.
Index Terms—Data driven approaches, ADASYN, ETD, NTL,
Machine learning, Deep learning, FPR, TPR.
I. BACKGROUND STUDY
The objective of electricity theft detection (ETD) is to detect electricity losses in a smart grid. There are two types of losses: technical losses and non-technical losses (NTL). The former occur due to energy dissipation in transmission lines, transformers and other types of electric equipment. The latter happen due to illegal activities: meter tampering, meter bypassing, using shunt devices, etc. [1]. In the literature, different approaches are designed to detect NTL in the smart grid: game theory, hardware-based and data-driven approaches. In the following articles, the authors use multiple data-driven approaches to analyse users’ consumption and extract abnormal patterns. Jokar et al. [2] suggest a consumption pattern-based electricity theft detector to detect abnormal patterns. An Irish electricity dataset, which contains information about only benign consumers (see footnote 1), is used to train the classifier. The authors propose six theft cases to generate malicious samples and argue that abnormal electricity consumption is always less than the normal consumption. However, they use an oversampling technique to handle the class imbalance. It generates duplicate copies of minority class samples, which affects the learning of the classifier. Punmiya et al. [3] present a comparison between gradient boosting classifiers, namely eXtreme gradient boosting (XGBoost), light gradient boosting (LightGBM) and categorical boosting (CatBoost), and the support vector machine (SVM). Irish smart meter data is used that contains 5% to 8% theft samples. New theft samples are generated through the six existing theft cases described in [2] to balance the dataset. In [4]–[6], the authors use supervised machine learning techniques to detect unusual consumption behaviour in electricity consumption datasets.
In [7], Razavi et al. develop an agnostic model for feature extraction through a finite mixture model and a genetic evolutionary algorithm. Kong et al. [8] propose a combined supervised and unsupervised method to detect illegal consumers. A dynamic time warping similarity measure is used for preliminary detection and then the data is passed to a hybrid SVM and K-nearest neighbour (KNN) based model to differentiate between normal and abnormal patterns. To tackle the class imbalance problem, synthesized electricity theft patterns are generated through a Wasserstein generative adversarial network. Finally, features are selected and extracted through clustering evaluation criteria and an autoencoder, respectively. Buzau et al. [9] use XGBoost, electricity consumption data and smart meter information to detect NTL in Spain. K-means clustering and the Euclidean and Manhattan distances are utilized to generate new features. Recent and old anomalies are captured through the z-score and the local outlier factor, respectively. The authors train several ML algorithms with the newly generated features and smart meter data and compare their results.
Footnote 1: Benign and normal samples are used interchangeably.
Buzau et al. [10] propose a hybrid model, which is a combination of long short-term memory (LSTM) and multilayer perceptron (MLP). The former is validated and tested on sequential data, whereas the latter is validated and tested on non-sequential data. The dataset used in this article is highly imbalanced, which biases the model towards the majority class and generates false results. Zheng et al. [11] propose a wide and deep convolutional neural network to detect electricity theft. Sequential data (weekly data) is passed to a convolutional neural network (CNN) to capture local patterns, while 1D data is fed into an MLP model (the wide component) to retrieve global patterns from electricity consumption data. However, they do not handle the class imbalance, which biases the classifier towards the majority class and increases the misclassification rate. Moreover, the MLP model is designed for tabular data, not for sequential data. In [12], Huang et al. develop a stacked sparse denoising autoencoder (SSDAE) to capture abnormal patterns from electricity consumption data. SSDAE extracts optimal features, compares them with the original ones and reduces the reconstruction loss. The authors introduce sparsity and noise to enhance the feature extraction ability of the autoencoder, and its hyperparameters are tuned using the particle swarm optimization technique. SSDAE achieves a 91.74% detection rate and a 7.19% false positive rate (FPR), which is better than state-of-the-art classifiers. Fenza et al. [13] introduce an idea that is based on concept drift and a time-aware system. Their proposed framework contains LSTM and K-means clustering algorithms. The K-means clustering algorithm is used to find similar consumption behaviour. LSTM is used to predict a user’s consumption, which is compared with the actual consumption to detect anomalies.
Bhat et al. [14] use LSTM, CNN and a stacked autoencoder to detect abnormal consumption patterns in distributed systems. The authors use a synthesized dataset that contains 7% abnormal samples. The CNN classifier outperforms the LSTM and the stacked autoencoder. Hasan et al. [15] propose a CNN-LSTM based hybrid model. CNN is employed to extract the optimal features and LSTM is used for the classification task. The dataset is imbalanced, which is handled through the synthetic minority oversampling technique (SMOTE). However, the authors do not consider the FPR, although a high FPR is very costly because utilities have a limited budget for on-site inspection. In [16], despite the extensive usage of machine learning (ML) techniques, the authors do not focus on the selection of optimal features. In [1], [17], the authors discuss the possibilities of implementing ML classifiers for the detection of NTL and describe the advantages of selecting optimal features and their impact on classifier performance. One of the main challenges [18] that limits the classification ability of existing methods is the high dimensionality of data. Moreover, smart meters greatly improve data collection procedures and provide high-dimensional data to capture complex patterns. However, research shows that most existing classification methods are based on conventional ML classifiers like ANN, SVM and decision tree, which have limited generalization ability and are unable to learn complex patterns in high-dimensional data.
The contributions of this work are listed below.
• The legitimate data of any consumer is collected from the consumption history. However, it is very difficult to obtain malicious samples because theft cases rarely happen and may not be present in the user’s consumption history. So, we apply six theft cases to generate malicious samples and argue that the motive of theft is to report less consumption than actual usage.
• After generating malicious samples, the adaptive synthetic minority oversampling (ADASYN) technique is exploited to tackle the class imbalance problem. This problem creates a biased model, which leads to a high FPR.
• We conduct an empirical study to compare the performance of different machine learning and deep learning models: XGBoost, CatBoost, SVM, KNN, MLP, LSTM and RF.
• We train and test all classifiers on a realistic electricity dataset that is provided by PRECON. Moreover, different measures such as the confusion matrix, precision, recall, F1-score, etc. are used to evaluate their performance.
II. ACQUIRING DATASET AND HANDLING CLASS
IMBALANCED PROBLEM
PRECON (see footnote 2) is the first dataset of its kind that belongs to users of a developing country. Data collection aims to understand the electricity consumption behaviour of users belonging to different demographics and different social and financial backgrounds. Data is collected for a period of one year and contains electricity consumption information of thirty-two houses. The people who participated in this research have smart meters installed in their houses and agreed to take part. So, it is a valid assumption that all participants are honest consumers. The large variety of consumers, the long measurement period and the online availability make this dataset an excellent source for research on smart grids. It contains thirty-two files with electricity measurements recorded every second. We reduce the data granularity by taking one sample every half hour because high granularity requires high computation power and affects consumer privacy. However, the dataset contains only honest consumers. Fig. 1a shows the electricity consumption of a benign consumer.
For the analysis of the electricity consumption dataset, we require theft samples, but theft may be completely absent from users’ consumption history. We solve the lack of theft samples by generating malicious samples from benign samples because the goal of electricity theft is to report less consumption or to shift load from on-peak hours to off-peak hours. In [2], Jokar et al. introduced six cases to generate malicious samples from benign ones. These theft cases are described below, where $x_t$ is the real consumption of a normal consumer at time slot $t$ ($t \in [1, 48]$).
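$$t_1(x_t) = \alpha x_t, \quad \alpha = \mathrm{random}(0.1, 0.8) \tag{1}$$
$$t_2(x_t) = \beta x_t, \quad \beta = \mathrm{random}[0.1, 0.8] \tag{2}$$
$$t_3(x_t) = \gamma_t x_t, \quad \gamma_t = \mathrm{random}(0.1, 0.8) \tag{3}$$
$$t_4(x_t) = \beta \, \mathrm{mean}(x), \quad \beta = \mathrm{random}(0.1, 0.8) \tag{4}$$
$$t_5(x_t) = \mathrm{mean}(x) \tag{5}$$
$$t_6(x_t) = x_{24-t} \tag{6}$$
Fig. 1: (a) Daily consumption of a normal consumer. (b) Consumption behaviour of a theft consumer.
Footnote 2: PRECON: Pakistan Residential Electricity Consumption Dataset.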
$t_1(\cdot)$ multiplies the meter reading by a random number between 0.1 and 0.8 and sends it to the electric utility. $t_2(\cdot)$ sends or does not send measurements at different intervals of time. $t_3(\cdot)$ multiplies each meter reading by a different random number and reports lower consumption. $t_4(\cdot)$ and $t_5(\cdot)$ report the mean of the measurements scaled by a random factor and the exact mean value of the measurements, respectively. $t_6(\cdot)$ reverses the order of the consumption readings and shifts load towards peak hours. $t_5(\cdot)$ and $t_6(\cdot)$ launch attacks against the load control mechanism and report high consumption in off-peak hours and low consumption in on-peak hours. We apply all these cases to the benign samples to generate the malicious samples. Fig. 1b shows the consumption of a normal consumer after applying the six theft cases.
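As a minimal sketch of how the six theft cases can be applied to one benign daily profile, the Python snippet below follows Eqs. (1)-(6) as printed. The function name `make_theft_samples`, the use of a single random factor per profile in cases 1, 2 and 4, and the implementation of case 6 as a plain reversal of the readings are assumptions made for illustration; they are not taken from the authors' implementation.

```python
import random
from statistics import mean


def make_theft_samples(x, rng):
    """Return six malicious variants of one benign profile x (a list of readings)."""
    avg = mean(x)
    alpha = rng.uniform(0.1, 0.8)  # Eq. (1): one factor for the whole profile
    beta2 = rng.uniform(0.1, 0.8)  # Eq. (2): implemented exactly as printed
    beta4 = rng.uniform(0.1, 0.8)  # Eq. (4): one factor applied to the mean
    return {
        "t1": [alpha * v for v in x],
        "t2": [beta2 * v for v in x],
        "t3": [rng.uniform(0.1, 0.8) * v for v in x],  # Eq. (3): new factor per reading
        "t4": [beta4 * avg for _ in x],
        "t5": [avg for _ in x],                        # Eq. (5): flat mean profile
        "t6": list(reversed(x)),                       # Eq. (6): assumed plain reversal
    }


# Hypothetical benign profile with 48 half-hourly readings (values are made up).
benign = [0.2 + 0.05 * (t % 12) for t in range(48)]
theft = make_theft_samples(benign, random.Random(42))
print({name: round(sum(profile), 2) for name, profile in theft.items()})
```

Each call produces six theft patterns from one benign record, so the number of theft patterns is six times the number of benign records.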
We have 321 normal consumption records and each theft case generates 321 new theft patterns. The total number of theft patterns is 1926, which is more than the number of normal records. This situation creates a class imbalance, which is a critical problem in ETD, where one class (usually honest consumers) dominates the other class (theft consumers). The data is not evenly distributed across the classes and is skewed towards the majority class. If a machine learning model is trained on an imbalanced dataset, it becomes biased towards the majority class and does not learn the important features of the minority class, which increases the FPR. Traditionally, two sampling techniques, undersampling and oversampling, are used to balance a dataset. However, these approaches are not adopted here due to computational overhead, information loss and duplication of existing data. In this manuscript, we adopt the Adaptive Synthetic Sampling Approach (ADASYN) to address the class imbalance [19], [20]. ADASYN is based on an adaptive learning approach: it focuses on the minority class samples that are difficult to learn rather than on those that are easy to learn. It not only reduces the learning bias caused by the imbalanced data distribution but also shifts the class boundary towards the minority class samples that are difficult to learn. The working process of ADASYN is described below.
The training data $D_{tr}$ contains $m$ samples $\{x_i, y_i\}$, $i = 1, \dots, m$, where $x_i$ belongs to the $n$-dimensional feature space $X$ and $y_i \in Y = \{0, 1\}$ is the class label associated with $x_i$. $m_s$ and $m_l$ represent the number of minority and majority class samples, respectively.
• Calculate the degree of class imbalance:
$$d = \frac{m_s}{m_l}, \quad d \in (0, 1] \tag{7}$$
• If $d < d_s$, where $d_s$ is a preset threshold for the maximum tolerated degree of class imbalance, continue with the following steps.
• Calculate the number of synthetic samples that need to be generated for the minority class:
$$G = (m_l - m_s)\,\beta, \quad \beta \in [0, 1] \tag{8}$$
$\beta$ is a parameter that decides the balance level between the minority and majority class samples. If $\beta = 1$, the data imbalance problem is fully resolved.
• For each $x_i$ in the minority class, find its $K$ nearest neighbours and calculate the ratio
$$r_i = \frac{\delta_i}{K}, \quad i = 1, \dots, m_s \tag{9}$$
where $\delta_i$ is the number of samples among the $K$ nearest neighbours of $x_i$ that belong to the majority class.
• Normalize $r_i$:
$$\hat{r}_i = \frac{r_i}{\sum_{i=1}^{m_s} r_i} \tag{10}$$
so that $\hat{r}_i$ is a density distribution with $\sum_{i=1}^{m_s} \hat{r}_i = 1$.
• Calculate the number of synthetic samples that are generated for each minority class sample $x_i$:
$$g_i = \hat{r}_i \, G \tag{11}$$
where $G$ is the total number of minority samples that need to be generated.
• For each $x_i$, generate $g_i$ samples using the following steps.
Do Loop 1 to $g_i$:
Randomly select one minority sample $x_{zi}$ from the $K$ nearest neighbours of $x_i$.
$$s_i = x_i + (x_{zi} - x_i)\,\lambda, \quad \lambda \in [0, 1] \tag{12}$$
End Loop
$s_i$ is a new minority class sample, which is added at the end of the dataset.
The idea of ADASYN is based on a density distribution in which $r_i$ automatically gives higher weights to the minority class samples that are difficult to learn. This is the major difference from SMOTE, an oversampling technique that gives equal weight to each minority class sample.
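The ADASYN steps in Eqs. (7)-(12) can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `adasyn`, the Euclidean distance, the rounding of $g_i$ to the nearest integer and the fallback used when a sample has no minority neighbour are all choices made here for illustration, and the threshold check on $d$ is omitted for brevity.

```python
import math
import random


def adasyn(minority, majority, k=5, beta=1.0, rng=None):
    """Generate synthetic minority samples following Eqs. (7)-(12)."""
    rng = rng or random.Random(0)
    total_needed = (len(majority) - len(minority)) * beta        # Eq. (8)
    labelled = [(p, 0) for p in minority] + [(p, 1) for p in majority]

    ratios, minority_neighbours = [], []
    for x in minority:
        # K nearest neighbours of x among all other samples (both classes).
        others = [(math.dist(x, p), p, label) for p, label in labelled if p is not x]
        others.sort(key=lambda item: item[0])
        knn = others[:k]
        ratios.append(sum(label for _, _, label in knn) / k)     # Eq. (9)
        minority_neighbours.append([p for _, p, label in knn if label == 0])

    ratio_sum = sum(ratios)
    if ratio_sum == 0:
        return []  # no minority sample has a majority neighbour

    synthetic = []
    for x, r, neighbours in zip(minority, ratios, minority_neighbours):
        g = round(r / ratio_sum * total_needed)                  # Eqs. (10)-(11)
        for _ in range(g):
            # Fall back to x itself when no minority neighbour exists (simplification).
            xz = rng.choice(neighbours) if neighbours else x
            lam = rng.random()
            synthetic.append(tuple(a + (b - a) * lam for a, b in zip(x, xz)))  # Eq. (12)
    return synthetic


# Hypothetical 2-D toy data: 3 minority points and 9 majority points.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
majority = [(2.0 + 0.1 * i, 2.0) for i in range(9)]
print(len(adasyn(minority, majority, k=3)))
```

Because every synthetic point is an interpolation between two minority samples, the new samples stay inside the region already covered by the minority class. Samples with more majority neighbours receive a larger share of the synthetic points.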
III. STATE-OF-THE-ART NTL DETECTION TECHNIQUES
There are different techniques that are used for NTL detection: data-driven, game theory and hardware approaches. Fig. 2 shows an overview of these approaches. We focus on supervised learning approaches, which have applications in different fields such as bioinformatics, object detection, text classification, spam detection and anomaly detection. In this manuscript, we use supervised learning approaches to detect anomalies in electricity consumption patterns and compare the results to evaluate their performance. These approaches are described below.
Fig. 2: Different techniques for electricity theft detection. [Taxonomy figure: techniques for NTL detection are divided into hardware-based, game theory and data-driven techniques; data-driven techniques are divided into supervised, unsupervised and semi-supervised learning. Supervised methods listed: MLP, SVM, decision tree, gradient boosting classifiers, LR, RF, CNN, LSTM. Methods listed under the remaining branches: anomaly detection, heuristic approaches, generative models, graph-based methods, association rule mining, clustering methods, regression models.]
Gradient boosting classifiers: Boosting is an ensemble technique where several weak learners are combined to form a strong learner. There are many ensemble techniques: random forest, adaptive boosting, gradient boosting, etc. All of these have different mechanisms to reduce the loss function. Gradient boosting algorithms use gradient descent to reduce the loss function. XGBoost is based on sequential learning, where weak learners are added one after another, while the construction of each learner is parallelized, which increases the speed of the algorithm. It is designed to utilize the maximum hardware resources: cache and hard drive are utilized efficiently to handle small and large datasets. Besides, it uses a weighted quantile sketch algorithm to find candidate split points. CatBoost is a recent member of the gradient boosting family. It is developed by the machine learning team at Yandex. It can handle categorical features and uses an ordered boosting strategy to avoid information leakage. CatBoost uses oblivious decision trees, which apply the same splitting criterion at each level and are less prone to overfitting.
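To illustrate the principle of fitting weak learners sequentially by gradient descent on a loss function, the sketch below boosts one-split decision stumps under the squared loss on a one-dimensional feature. It is a deliberately simplified illustration and not the XGBoost or CatBoost algorithm; the function names, the learning rate and the toy data are assumptions made here.

```python
def fit_stump(xs, residuals):
    """Find the threshold split of a 1-D feature that best fits the residuals."""
    best = None
    for thr in sorted(set(xs))[:-1]:  # skip the largest value so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, thr, left_mean, right_mean)
    return best[1:]


def gradient_boost(xs, ys, rounds=50, learning_rate=0.1):
    """Fit stumps one after another to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(rounds):
        # For the squared loss, the negative gradient is the residual y - prediction.
        residuals = [y - p for y, p in zip(ys, predictions)]
        thr, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((thr, left_mean, right_mean))
        predictions = [p + learning_rate * (left_mean if x <= thr else right_mean)
                       for x, p in zip(xs, predictions)]
    return base, learning_rate, stumps


def predict(model, x):
    """Sum the base value and the scaled outputs of all stumps."""
    base, learning_rate, stumps = model
    return base + sum(learning_rate * (lm if x <= thr else rm) for thr, lm, rm in stumps)


# Hypothetical toy data: label 1 (theft) for large feature values, 0 otherwise.
model = gradient_boost([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1])
print(round(predict(model, 2), 3), round(predict(model, 5), 3))
```

Each round corrects a fraction of the remaining error, so the ensemble approaches the training labels gradually; production libraries add regularization, second-order information and efficient split search on top of this basic loop.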
Support vector machine: SVM is a well-known classifier in ETD. It is an extension of the maximal margin classifier. The classifier is fitted on training data and its predictions are evaluated on testing data. It uses kernel functions to transform data into a higher-dimensional space, where a hyperplane can be drawn to separate the classes. Polynomial and radial basis function (RBF) kernels are used to handle non-linear data, whereas a linear kernel is used to draw a decision boundary between the classes of linearly separable data. In [2], the authors use SVM to capture abnormal patterns from electricity consumption data.
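For reference, the three kernel families named above can be written as short similarity functions. The sketch below uses the common textbook forms; the parameter names (`degree`, `coef0`, `gamma`) and their default values are illustrative assumptions and are not settings reported in this paper.

```python
import math


def linear_kernel(u, v):
    """Plain dot product: suitable when classes are linearly separable."""
    return sum(a * b for a, b in zip(u, v))


def polynomial_kernel(u, v, degree=2, coef0=1.0):
    """Dot product shifted by coef0 and raised to a power."""
    return (linear_kernel(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma=0.5):
    """Similarity that decays with the squared Euclidean distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


u, v = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(u, v), polynomial_kernel(u, v), rbf_kernel(u, v))
```

The RBF kernel returns 1 for identical inputs and approaches 0 as the inputs move apart, which is what allows non-linear decision boundaries in the original feature space.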
Random forest: Random forest (RF) is an ensemble learning technique where multiple decision trees are combined to give a final decision through majority voting. It is widely used for classification and regression tasks. Moreover, it reduces the overfitting problem by aggregating multiple decision trees. However, the main limitation of random forest is that it contains a large number of decision trees, which increases computation and makes it less efficient for real-world problems.
k-nearest neighbours algorithm: KNN is a simple and easy-to-implement supervised machine learning algorithm. It is used for both classification and regression tasks. However, it is mostly used for classification tasks in industry. KNN relies on a similarity (distance) measure to handle classification and regression problems. It is a lazy and non-parametric learning algorithm. Its computation time, memory usage and accuracy depend upon the nature of the data. In [8], the authors use KNN with SVM to reduce the misclassification of data points that are near the decision boundary.
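A KNN classifier can be sketched in a few lines, which also shows why it is called a lazy learner: there is no training step, and all distance computations happen at prediction time. The function name `knn_predict`, the Euclidean distance and the two-dimensional toy features are assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label a query point by majority vote among its k nearest training samples."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D features (for example, mean and peak consumption); values are made up.
train = [
    ((0.0, 0.0), "normal"), ((0.0, 1.0), "normal"), ((1.0, 0.0), "normal"),
    ((5.0, 5.0), "theft"), ((5.0, 6.0), "theft"), ((6.0, 5.0), "theft"),
]
print(knn_predict(train, (0.5, 0.5)), knn_predict(train, (5.5, 5.5)))
```

Every prediction scans the whole training set, so the cost of a query grows with the number of stored samples.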
Multilayer perceptron: The artificial neural network, or multilayer perceptron (MLP), is one of the first biologically inspired machine learning algorithms. It contains input, hidden and output layers. It is successfully used for NTL detection in [1] and [2]. It extracts the latent information from a consumer’s consumption to differentiate between normal and theft samples.
TABLE I: Confusion matrix of all classifiers
Type of classifier   True positive   True negative   False positive   False negative
XGBoost              58              379             6                7
CatBoost             54              367             10               19
RF                   57              289             7                97
SVM-linear           50              331             14               55
KNN                  63              340             1                40
MLP                  54              338             10               48
LSTM                 51              377             13               9

TABLE II: Precision, recall, F1-score and FPR of all classifiers
Type of classifier   Precision   Recall   F1-measure   FPR
XGBoost              0.983       0.984    0.981        0.015
CatBoost             0.973       0.950    0.961        0.026
RF                   0.976       0.748    0.847        0.023
SVM-linear           0.959       0.857    0.905        0.04
KNN                  0.880       0.997    0.935        0.002
MLP                  0.971       0.875    0.920        0.028
LSTM                 0.966       0.976    0.971        0.033

Long short-term memory: LSTM networks are variants of recurrent neural networks (RNN). A smart meter provides the historical data of a customer, which can span years and grows daily. Plain RNNs are unable to detect abnormal patterns in long sequences due to the vanishing and exploding gradient problems. LSTM is an enhanced version of the RNN that addresses these problems. We use LSTM to capture long-term patterns in the user’s historical data. The LSTM structure is similar to that of the RNN but differs in its internal components. The important part of LSTM is its cell state, which acts as a transport highway that carries relevant patterns down the sequence chain. It has three gates to regulate the flow of information through the network. The forget gate decides which information to keep in or discard from the cell state. The input gate uses the sigmoid and tanh functions to update the cell state. The output gate produces the final output and also determines the hidden state. The equations of the LSTM network are given below [15].
$$f_t = \sigma(w_f (h_{t-1}, x_t) + b_f) \tag{13}$$
$$i_t = \sigma(w_i (h_{t-1}, x_t) + b_i) \odot \tanh(w_i (h_{t-1}, x_t) + b_i) \tag{14}$$
$$C_t = f_t \odot C_{t-1} + i_t \tag{15}$$
$$o_t = \sigma(w_o (h_{t-1}, x_t) + b_o) \tag{16}$$
$$h_t = o_t \odot \tanh(C_t) \tag{17}$$
$f_t$, $i_t$ and $o_t$ represent the forget gate, input gate and output gate, respectively. $x_t$, $w_f$, $w_i$ and $w_o$ denote the current input, forget gate weight, input gate weight and output gate weight. $C_t$ and $h_t$ represent the current cell state and hidden state, respectively. $C_{t-1}$ and $h_{t-1}$ denote the previous cell state and hidden state. $b_f$, $b_i$ and $b_o$ represent the forget gate bias, input gate bias and output gate bias. $\odot$ denotes point-wise multiplication and $\sigma$ represents the sigmoid function. The tanh function squashes values to the range between -1 and +1.
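As a sketch of one LSTM update following Eqs. (13)-(17), the snippet below implements a cell with a scalar input and a scalar hidden state in plain Python. Eq. (14) writes the sigmoid and tanh terms with the same symbols $w_i$ and $b_i$; the sketch keeps a separate parameter set (the hypothetical key `"c"`) for the tanh term, as is common in LSTM implementations, which is an assumption beyond the text. The function and parameter names are illustrative only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One scalar LSTM step; params maps a gate name to (w_h, w_x, b)."""
    def pre_activation(gate):
        w_h, w_x, b = params[gate]
        return w_h * h_prev + w_x * x_t + b

    f_t = sigmoid(pre_activation("f"))                                    # Eq. (13)
    i_t = sigmoid(pre_activation("i")) * math.tanh(pre_activation("c"))   # Eq. (14)
    c_t = f_t * c_prev + i_t                                              # Eq. (15)
    o_t = sigmoid(pre_activation("o"))                                    # Eq. (16)
    h_t = o_t * math.tanh(c_t)                                            # Eq. (17)
    return h_t, c_t


# Hypothetical parameters and a short made-up consumption sequence.
params = {"f": (0.5, 0.1, 0.0), "i": (0.3, 0.8, 0.0),
          "c": (0.2, 0.9, 0.0), "o": (0.4, 0.6, 0.0)}
h, c = 0.0, 0.0
for reading in [0.2, 0.4, 0.1, 0.0]:
    h, c = lstm_step(reading, h, c, params)
print(round(h, 4), round(c, 4))
```

The cell state is carried from step to step and is only rescaled by the forget gate, which is how information from early readings can survive over long sequences.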
IV. EXPERIMENTAL RESULTS
In this section, we evaluate the performance of all classifiers through extensive experiments. All of them are implemented on Google Colab, a free cloud-hosted notebook platform that is widely used for machine learning analysis because it provides access to cloud computing resources. SVM-linear, KNN and RF are implemented through the scikit-learn library, while LSTM and MLP are trained and evaluated through the TensorFlow library. XGBoost and CatBoost are available as open-source libraries on GitHub. One contribution is that we compare the performance of different classifiers and give a detailed analysis by utilizing different performance measures: precision, recall, F1-score and FPR. Another contribution is that we compare the efficiency of recently developed state-of-the-art classifiers, XGBoost and CatBoost, with conventional machine learning and deep learning models. Table I shows the confusion matrix of all implemented classifiers.
Precision, recall, F1-score and FPR are presented in Table II. We first consider the F1-score as a performance measure. XGBoost has a higher F1-measure than CatBoost and RF. Notably, RF also belongs to the group of ensemble learning methods, but it has the lowest performance. KNN and SVM-linear obtain F1-measure values of 0.935 and 0.905, respectively. KNN gives good results, which suggests that the normal and theft classes are easily separable. The LSTM and MLP classifiers belong to the deep learning class. LSTM attains an F1-measure of 0.971, which is higher than that of MLP (0.920), because it has memory cells to remember the consumption patterns of theft and normal consumers. XGBoost gives the highest performance among all classifiers because it performs sequential learning and reduces the misclassification rate by utilizing a gradient descent algorithm.
Fig. 3: ROC curve of all classifiers. (a) XGBoost, CatBoost, SVM-linear. (b) MLP, LSTM, RF, KNN.
Taking precision as an efficiency metric, ensemble classifiers achieve high precision compared to conventional and deep learning models. XGBoost attains a precision of 0.983, which is higher than that of all other implemented classifiers. MLP attains 0.971, which is better than the LSTM precision of 0.966. Now, we take recall as a performance measure. It is the fraction of relevant (positive) samples that are returned by a classifier. KNN obtains the highest recall (0.997), while RF achieves the lowest value (0.748). FPR, referred to in this paper as the misclassification rate, is very important for electric utilities because they have limited resources for on-site inspection. KNN gives the lowest FPR among all implemented classifiers. However, one drawback of KNN is that it belongs to the group of lazy learning classifiers. These classifiers give good results on small datasets, while their performance decreases drastically on larger datasets.
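For reference, the four measures discussed above can be computed from the entries of a confusion matrix with their standard definitions. The sketch below is illustrative: the function name is hypothetical, the counts in the example are made up and are not taken from Table I, and the code assumes that none of the denominators is zero.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard precision, recall, F1-score and FPR from confusion-matrix counts."""
    precision = tp / (tp + fp)   # share of flagged samples that are truly positive
    recall = tp / (tp + fn)      # share of positive samples that are detected
    f1 = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)         # share of negative samples that are wrongly flagged
    return {"precision": precision, "recall": recall, "f1": f1, "fpr": fpr}


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, tn=80, fp=20, fn=10)
print({name: round(value, 3) for name, value in metrics.items()})
```

A lower FPR means fewer honest consumers are flagged, which directly reduces the number of unnecessary on-site inspections.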
The receiver operating characteristic (ROC) curve is a tool that is commonly used to assess the performance of machine learning models. It plots the true positive rate against the false positive rate at different threshold values between 0 and 1. Fig. 3 shows the ROC curves of all models to better evaluate their performance. XGBoost outperforms all classifiers and achieves the highest ROC curve, while RF gives the lowest performance. The remaining classifiers, CatBoost, SVM-linear, MLP, LSTM and KNN, attain ROC-AUC values of 0.971, 0.884, 0.914 and 0.965. The precision-recall curve (PR curve) is another measure that is used to assess classifier performance on imbalanced datasets. It plots precision against recall at different thresholds between 0 and 1. The AUC of the PR curve summarizes the skill of a classifier across these thresholds.
Fig. 4: PR curve of all classifiers. (a) XGBoost, CatBoost, SVM-linear. (b) MLP, LSTM, RF, KNN.
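The construction of a ROC curve and its AUC can be sketched as follows: the decision threshold is swept over the classifier scores, a (FPR, TPR) point is recorded at each distinct score, and the area is obtained with the trapezoidal rule. The function names, the scores and the labels are hypothetical and serve only as an illustration; the code assumes that both classes are present in the labels.

```python
def roc_points(scores, labels):
    """Sweep the threshold from high to low and collect (FPR, TPR) points."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for index, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Record a point only after the last sample that shares this score.
        if index + 1 == len(ranked) or ranked[index + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under a piecewise-linear curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical classifier scores; label 1 marks a theft sample.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
print(round(auc(roc_points(scores, labels)), 3))
```

An AUC of 1 corresponds to a classifier that ranks every theft sample above every normal sample, while 0.5 corresponds to random ranking.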
The PR curves of the implemented classifiers are shown in Fig. 4 to check their performance on the imbalanced dataset. XGBoost outperforms all classifiers and achieves an AUC of 0.997, while RF achieves the lowest AUC. CatBoost, SVM, MLP, LSTM and KNN obtain AUC values of 0.995, 0.975, 0.995 and 0.994. In this research article, we empirically analyze the performance of conventional machine learning and deep learning classifiers for NTL detection. XGBoost outperforms the other classifiers and achieves the best results. It has a built-in feature extraction module that removes noise, extracts optimal features and improves performance. Moreover, it performs sequential learning, where weak learners are trained one after another and are combined in the end to make a strong learner. For all of these reasons, XGBoost performs better than all other tested classifiers.
V. CONCLUSION
In this manuscript, we exploit different supervised learning methods, namely XGBoost, CatBoost, SVM, RF, KNN, MLP and LSTM, to detect anomalies in users’ consumption history. For a consumer in a smart grid, we can easily obtain benign samples from the consumption history. However, theft cases may not be present in the consumption history. We use six cases to generate malicious samples and argue that the purpose of theft is to report less consumption than the actual electricity consumption. ADASYN is utilized to balance the ratio between benign and theft samples. A realistic electricity consumption dataset, provided by the electric utility of Pakistan, is utilized to train and evaluate all implemented classifiers. XGBoost outperforms the other classifiers and achieves a ROC-AUC of 0.986 and a PR-AUC of 0.997. It has a built-in feature extraction module that reduces noise, selects the optimal features and increases its performance. Moreover, precision, recall, F1-score and FPR are utilized to evaluate the performance of the classifiers.
REFERENCES
[1] Avila, Nelson Fabian, Gerardo Figueroa, and Chia-Chi Chu. "NTL detection in electric distribution systems using the maximal overlap discrete wavelet-packet transform and random undersampling boosting." IEEE Transactions on Power Systems 33, no. 6 (2018): 7171-7180.
[2] Jokar, Paria, Nasim Arianpoo, and Victor CM Leung. "Electricity theft detection in AMI using customers’ consumption patterns." IEEE Transactions on Smart Grid 7, no. 1 (2015): 216-226.
[3] Punmiya, Rajiv, and Sangho Choe. "Energy theft detection using gradient boosting theft detector with feature engineering-based preprocessing." IEEE Transactions on Smart Grid 10, no. 2 (2019): 2326-2329.
[4] Khan, Zahoor Ali, Muhammad Adil, Nadeem Javaid, Malik Najmus Saqib, Muhammad Shafiq, and Jin-Ghoo Choi. "Electricity theft detection using supervised learning techniques on smart meter data." Sustainability 12, no. 19 (2020): 8023.
[5] Arif, Arooj, Nadeem Javaid, Abdulaziz Aldegheishem, and Nabil Alrajeh. "Big Data Analytics for Identifying Electricity Theft using Machine Learning Approaches in Micro Grids for Smart Communities." Concurrency and Computation: Practice and Experience (2021): 1532-0634.
[6] Ghori, Khawaja Moyeezullah, Rabeeh Ayaz Abbasi, Muhammad Awais, Muhammad Imran, Ata Ullah, and Laszlo Szathmary. "Performance analysis of different types of machine learning classifiers for non-technical loss detection." IEEE Access 8 (2019): 16033-16048.
[7] Razavi, Rouzbeh, Amin Gharipour, Martin Fleury, and Ikpe Justice Akpan. "A practical feature-engineering framework for electricity theft detection in smart grids." Applied Energy 238 (2019): 481-494.
[8] Kong, Xiangyu, Xin Zhao, Chao Liu, Qiushuo Li, DeLong Dong, and Ye Li. "Electricity theft detection in low-voltage stations based on similarity measure and DT-KSVM." International Journal of Electrical Power & Energy Systems 125 (2021): 106544.
[9] Buzau, Madalina Mihaela, Javier Tejedor-Aguilera, Pedro Cruz-Romero, and Antonio Gómez-Expósito. "Detection of non-technical losses using smart meter data and supervised learning." IEEE Transactions on Smart Grid 10, no. 3 (2018): 2661-2670.
7
[10] Buzau, Madalina-Mihaela, Javier Tejedor-Aguilera, Pedro Cruz-Romero,
and Antonio G´
omez-Exp´
osito. ”Hybrid deep neural networks for detec-
tion of non-technical losses in electricity smart meters.” IEEE Transac-
tions on Power Systems 35, no. 2 (2019): 1254-1263.
[11] Zheng, Zibin, Yatao Yang, Xiangdong Niu, Hong-Ning Dai, and Yuren
Zhou. “Wide and deep convolutional neural networks for electricity-theft
detection to secure smart grids.” IEEE Transactions on Industrial
Informatics 14, no. 4 (2017): 1606-1615.
[12] Huang, Yifan, and Qifeng Xu. “Electricity theft detection based on
stacked sparse denoising autoencoder.” International Journal of Electrical
Power & Energy Systems 125 (2021): 106448.
[13] Fenza, Giuseppe, Mariacristina Gallo, and Vincenzo Loia. “Drift-aware
methodology for anomaly detection in smart grid.” IEEE Access 7
(2019): 9645-9657.
[14] Bhat, Rajendra Rana, Rodrigo Daniel Trevizan, Rahul Sengupta,
Xiaolin Li, and Arturo Bretas. “Identifying nontechnical power loss via
spatial and temporal deep learning.” In 2016 15th IEEE International
Conference on Machine Learning and Applications (2016): 272-279.
[15] Hasan, Md, Rafia Nishat Toma, Abdullah-Al Nahid, M. M. Islam, and
Jong-Myon Kim. “Electricity theft detection in smart grid systems: A
CNN-LSTM based approach.” Energies 12, no. 17 (2019): 3310.
[16] Ramos, Caio C. O., Douglas Rodrigues, André N. de Souza, and João P.
Papa. “On the study of commercial losses in Brazil: a binary black hole
algorithm for theft characterization.” IEEE Transactions on Smart Grid
9, no. 2 (2016): 676-683.
[17] Coma-Puig, Bernat, and Josep Carmona. “Bridging the gap between
energy consumption and distribution through non-technical loss detection.”
Energies 12, no. 9 (2019): 1748.
[18] Hu, Tianyu, Qinglai Guo, Hongbin Sun, Tian-En Huang, and Jian Lan.
“Nontechnical losses detection through coordinated BiWGAN and SVDD.”
IEEE Transactions on Neural Networks and Learning Systems (2020):
2162-237.
[19] He, Haibo, Yang Bai, Edwardo A. Garcia, and Shutao Li. “ADASYN:
Adaptive synthetic sampling approach for imbalanced learning.” In 2008
IEEE International Joint Conference on Neural Networks (2008): 1322-1328.
[20] Javaid, Nadeem, Naeem Jan, and Muhammad Umar Javed. “An adaptive
synthesis to handle imbalanced big data with deep siamese network
for electricity theft detection in smart grids.” Journal of Parallel and
Distributed Computing (2021): 0743-7315.
Research Proposal
In this synopsis, the first solution introduces a hybrid deep learning (DL) model, which tackles the class imbalance problem, the curse of dimensionality and the low detection rate of existing models. The proposed model integrates the benefits of both GoogLeNet and the gated recurrent unit (GRU). The one-dimensional electricity consumption (EC) data is fed into the GRU to capture periodic patterns, whereas the GoogLeNet model is leveraged to extract latent features from the two-dimensional weekly stacked EC data. Furthermore, the time least square generative adversarial network is proposed to solve the class imbalance problem. The second solution presents a framework that addresses the curse of dimensionality. In the literature, existing studies are mostly concerned with tuning the hyperparameters of machine learning (ML)/DL methods for efficient detection of non-technical losses (NTL). Some of them focus on selecting prominent features from the data to improve the performance of electricity theft detection. However, the curse of dimensionality affects the generalization ability of ML/DL classifiers and leads to computational, storage and overfitting problems. Therefore, to deal with the above-mentioned issues, this study proposes a system based on metaheuristic techniques (artificial bee colony and genetic algorithm) and a denoising autoencoder for electricity theft detection using big data in electric power systems. The third solution introduces a hybrid DL model for predicting upward and downward trends in financial market data. The financial market exhibits complex and volatile behavior that is difficult to predict using conventional ML and statistical methods, as well as shallow neural networks. Its behavior depends on many factors such as political upheavals, investor sentiment, interest rates, government policies and natural disasters. However, it is possible to predict upward and downward trends in financial market behavior using complex DL models.
In this synopsis, we have proposed three solutions to different issues in smart grids and financial markets. The proposed solutions will be validated in the thesis work using real-world datasets.
Thesis
Data science is an emerging field with applications in multiple disciplines, such as healthcare, advanced image recognition, airline route planning, augmented reality and targeted advertising. In this thesis, we exploit its applications in smart grids and financial markets through three major contributions. In the first two contributions, machine learning (ML) and deep learning (DL) models are utilized to detect anomalies in electricity consumption (EC) data, while in the third contribution, upward and downward trends in the financial markets are predicted to benefit potential investors. Non-technical losses (NTLs) are one of the major causes of revenue loss for electric utilities. In the literature, various ML and DL approaches are employed to detect NTLs. The first solution introduces a hybrid DL model, which tackles the class imbalance problem, the curse of dimensionality and the low detection rate of existing models. The proposed model integrates the benefits of both GoogLeNet and the gated recurrent unit (GRU). The one-dimensional EC data is fed into the GRU to capture periodic patterns, whereas the GoogLeNet model is leveraged to extract latent features from the two-dimensional weekly stacked EC data. Furthermore, the time least square generative adversarial network (TLSGAN) is proposed to solve the class imbalance problem. The TLSGAN uses unsupervised and supervised loss functions to generate fake theft samples that closely resemble real-world theft samples. The standard generative adversarial network only updates the weights of those points that lie on the wrong side of the decision boundary, whereas TLSGAN also updates the weights of those points that lie on the correct side of the decision boundary, which protects the model from the vanishing gradient problem. Moreover, dropout and batch normalization layers are utilized to enhance the model's convergence speed and generalization ability.
The proposed model is compared with different state-of-the-art classifiers, including multilayer perceptron (MLP), support vector machine, naive Bayes, logistic regression, MLP-long short-term memory network and wide and deep convolutional neural network. The second solution presents a framework that addresses the curse of dimensionality. In the literature, existing studies are mostly concerned with tuning the hyperparameters of ML/DL methods for efficient detection of NTL, i.e., electricity theft detection. Some of them focus on selecting prominent features from the data to improve the performance of electricity theft detection. However, the curse of dimensionality affects the generalization ability of ML/DL classifiers and leads to computational, storage and overfitting problems. Therefore, to deal with the above-mentioned issues, this study proposes a system based on metaheuristic techniques (artificial bee colony and genetic algorithm) and a denoising autoencoder for electricity theft detection using big data in electric power systems. The former (metaheuristics) are used to select prominent features, while the latter is utilized to extract high-variance features from electricity consumption data. First, new features are synthesized from statistical and electrical parameters of the user's consumption history. Then, the synthesized features are used as input to the metaheuristic techniques to find a subset of optimal features. Finally, the optimal features are fed as input to the denoising autoencoder to extract features with high variance. The ability of both techniques to select and extract features is measured using a support vector machine. The proposed system reduces the overfitting, storage and computational overhead of ML classifiers. Moreover, we perform several experiments to verify the effectiveness of our proposed system, and the results reveal that the proposed system outperforms its counterparts.
The third solution introduces a hybrid DL model for predicting upward and downward trends in financial market data. The financial market exhibits complex and volatile behavior that is difficult to predict using conventional ML and statistical methods, as well as shallow neural networks. Its behavior depends on many factors such as political upheavals, investor sentiment, interest rates, government policies and natural disasters. However, it is possible to predict upward and downward trends in financial market behavior using complex DL models. This paper therefore addresses the following limitations that adversely affect the performance of existing ML and DL models: the curse of dimensionality, the low accuracy of standalone models, and the inability to learn complex patterns from high-frequency time series data. The denoising autoencoder is used to reduce the high dimensionality of the data, overcoming the problem of overfitting and reducing the training time of the ML and DL models. Moreover, a hybrid DL model, HRG, is proposed based on a ResNet module and gated recurrent units. The former is used to extract latent or abstract patterns that are not visible to the human eye, while the latter retrieves temporal patterns from the financial market dataset. Thus, HRG integrates the advantages of both models. It is evaluated on real-world financial market datasets obtained from IBM, APPL, BA and WMT. Also, various performance indicators such as F1-score, accuracy, precision, recall and receiver operating characteristic-area under the curve (ROC-AUC) are used to check the performance of the proposed and benchmark models. HRG achieves ROC-AUC values of 0.95, 0.90, 0.82 and 0.80 on the APPL, IBM, BA and WMT datasets, respectively, which are higher than the ROC-AUC values of all implemented ML and DL models.
Article
The bi-directional flow of energy and information in the smart grid makes it possible to record and analyze the electricity consumption profiles of consumers. Because of rising inflation over the past few years, people have started looking for ways to use electricity illegally, which is termed electricity theft. Many data analytics techniques have been proposed in the literature for electricity theft detection (ETD). These techniques help in the detection of suspected illegal consumers. However, the existing approaches have a low ETD rate, due either to improper handling of the imbalanced class problem in a dataset or to the selection of an inappropriate classifier. In this paper, a robust big data analytics technique is proposed to resolve the aforementioned concerns. Firstly, adaptive synthesis (ADASYN) is applied to handle the imbalanced class problem of the data. Secondly, a deep siamese network (DSN) integrating a convolutional neural network (CNN) and long short-term memory (LSTM) is proposed to discriminate between the features of honest and fraudulent consumers. Specifically, the task of feature extraction from weekly energy consumption profiles is handed over to the CNN module, while the LSTM module performs the sequence learning. Finally, the DSN considers the shared features provided by the CNN-LSTM and makes the final judgment. The data analytics is performed on different train-test ratios of the real-time smart meters' data. The simulation results validate the proposed model's effectiveness in terms of high area under the curve, F1-score, precision and recall.
Article
Due to the increase in the number of electricity thieves, electric utilities are facing problems in providing electricity to their consumers in an efficient way. Accurate Electricity Theft Detection (ETD) is quite challenging due to inaccurate classification on imbalanced electricity consumption data, overfitting issues and the high False Positive Rate (FPR) of the existing techniques. Therefore, intensified research is needed to accurately detect electricity thieves and to recover the huge revenue loss of utility companies. To address the above limitations, this paper presents a new model, which is based on supervised machine learning techniques and real electricity consumption data. Initially, the electricity data are pre-processed using interpolation, the three-sigma rule and normalization methods. Since the distribution of labels in the electricity consumption data is imbalanced, the Adasyn algorithm is utilized to address this class imbalance problem. It is used to achieve two objectives. Firstly, it intelligently increases the minority class samples in the data. Secondly, it prevents the model from being biased towards the majority class samples. Afterwards, the balanced data are fed into a Visual Geometry Group (VGG-16) module to detect abnormal patterns in electricity consumption. Finally, a Firefly Algorithm based Extreme Gradient Boosting (FA-XGBoost) technique is exploited for classification. Simulations are conducted to show the performance of the proposed model. Moreover, state-of-the-art methods are also implemented for comparative analysis, i.e., Support Vector Machine (SVM), Convolutional Neural Network (CNN), and Logistic Regression (LR). For validation, precision, recall, F1-score, Matthews Correlation Coefficient (MCC), Receiver Operating Characteristic Area Under Curve (ROC-AUC), and Precision Recall Area Under Curve (PR-AUC) metrics are used.
Firstly, the simulation results show that the proposed Adasyn method has improved the performance of the FA-XGBoost classifier, which has achieved an F1-score, precision, and recall of 93.7%, 92.6%, and 97%, respectively. Secondly, the VGG-16 module achieved a higher generalized performance by securing an accuracy of 87.2% and 83.5% on training and testing data, respectively. Thirdly, the proposed FA-XGBoost has correctly identified actual electricity thieves, i.e., a recall of 97%. Moreover, our model is superior to the other state-of-the-art models in terms of handling large time series data and accurate classification. These models can be efficiently applied by utility companies using real electricity consumption data to identify electricity thieves and overcome major revenue losses in the power sector.
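The pre-processing steps named in the preceding abstract (interpolation, three-sigma rule, normalization) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the cited authors' implementation: the neighbour-averaging interpolation, the capping of values at mean + 3 standard deviations, the min-max scaling, and all function names are hypothetical choices made for this example.

```python
from statistics import mean, pstdev


def interpolate_missing(series):
    """Fill None gaps with the average of the nearest known neighbours.

    Assumed scheme: if only one neighbour exists, its value is used;
    if none exists, the gap is filled with 0.0.
    """
    filled = list(series)
    for i, value in enumerate(filled):
        if value is None:
            # Nearest earlier value (already filled) and nearest later known value.
            prev = next((filled[j] for j in range(i - 1, -1, -1)
                         if filled[j] is not None), None)
            nxt = next((series[j] for j in range(i + 1, len(series))
                        if series[j] is not None), None)
            known = [v for v in (prev, nxt) if v is not None]
            filled[i] = sum(known) / len(known) if known else 0.0
    return filled


def clip_three_sigma(series):
    """Cap readings above mean + 3 * (population) standard deviation."""
    upper = mean(series) + 3 * pstdev(series)
    return [min(v, upper) for v in series]


def min_max_normalize(series):
    """Scale readings to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(v - lo) / (hi - lo) for v in series]


def preprocess(series):
    """Apply the three steps in the order given in the abstract."""
    return min_max_normalize(clip_three_sigma(interpolate_missing(series)))


# Hypothetical daily readings with one missing value and one extreme spike.
readings = [1.0] * 9 + [None] + [1.0] * 9 + [100.0]
cleaned = preprocess(readings)
```

Capping rather than dropping outliers keeps the series length unchanged, which matters when each consumer's record must map to a fixed-size input for a downstream classifier.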
Article
With the ever-growing demand of electric power, it is quite challenging to detect and prevent Non-Technical Loss (NTL) in power industries. NTL is committed by meter bypassing, hooking from the main lines, reversing and tampering the meters. Manual on-site checking and reporting of NTL remains an unattractive strategy due to the required manpower and associated cost. The use of machine learning classifiers has been an attractive option for NTL detection. It enhances data-oriented analysis and high hit ratio along with less cost and manpower requirements. However, there is still a need to explore the results across multiple types of classifiers on a real-world dataset. This paper considers a real dataset from a power supply company in Pakistan to identify NTL. We have evaluated 15 existing machine learning classifiers across 9 types which also include the recently developed CatBoost, LGBoost and XGBoost classifiers. Our work is validated using extensive simulations. Results elucidate that ensemble methods and Artificial Neural Network (ANN) outperform the other types of classifiers for NTL detection in our real dataset. Moreover, we have also derived a procedure to identify the top-14 features out of a total of 71 features, which are contributing 77% in predicting NTL. We conclude that including more features beyond this threshold does not improve performance and thus limiting to the selected feature set reduces the computation time required by the classifiers. Last but not least, the paper also analyzes the results of the classifiers with respect to their types, which has opened a new area of research in NTL detection.
Article
Among an electricity provider's non-technical losses, electricity theft has the most severe and dangerous effects. Fraudulent electricity consumption decreases the supply quality, increases generation load, causes legitimate consumers to pay excessive electricity bills, and affects the overall economy. The adaptation of smart grids can significantly reduce this loss through data analysis techniques. The smart grid infrastructure generates a massive amount of data, including the power consumption of individual users. Utilizing this data, machine learning and deep learning techniques can accurately identify electricity theft users. In this paper, an electricity theft detection system is proposed based on a combination of a convolutional neural network (CNN) and a long short-term memory (LSTM) architecture. CNN is a widely used technique that automates feature extraction and the classification process. Since the power consumption signature is time-series data, we were led to build a CNN-based LSTM (CNN-LSTM) model for smart grid data classification. In this work, a novel data pre-processing algorithm was also implemented to compute the missing instances in the dataset, based on the local values relative to the missing data point. Furthermore, in this dataset, the count of electricity theft users was relatively low, which could have made the model inefficient at identifying theft users. This class imbalance scenario was addressed through synthetic data generation. Finally, the results obtained indicate the proposed scheme can classify both the majority class (normal users) and the minority class (electricity theft users) with good accuracy.
Article
The application of Artificial Intelligence techniques in industry equips companies with new essential tools to improve their principal processes. This is especially true for energy companies, as they have the opportunity, thanks to the modernization of their installations, to exploit a large amount of data with smart algorithms. In this work we explore the possibilities that exist in the implementation of Machine-Learning techniques for the detection of Non-Technical Losses in customers. The analysis is based on the work done in collaboration with an international energy distribution company. We report on how the success in detecting Non-Technical Losses can help the company to better control the energy provided to their customers, avoiding a misuse and hence improving the sustainability of the service that the company provides.
Article
Electricity Theft (ET) causes major revenue loss in power utilities. It reduces the quality of supply, raises production cost, causes legal consumers to pay a higher cost and impacts the economy as a whole. In this paper, we use the State Grid Corporation of China (SGCC) dataset, which contains electricity consumption data of 1035 days for two classes: normal and fraudulent. In this work, an Electricity Theft Detection (ETD) model is proposed that consists of four steps: interpolation, data balancing, feature extraction and classification. Firstly, missing values of the dataset are recovered using the interpolation method. Secondly, a resampling technique is implemented. ET consumers make up only 9% of the SGCC dataset, which makes it difficult for the model to correctly classify both classes (normal and theft). A hybrid resampling technique is therefore proposed, named Synthetic Minority Oversampling Technique with Near Miss (SMOTE-NM). Thirdly, a Residual Network (ResNet) extracts the latent features from the SGCC dataset. Fourthly, three tree-based classifiers, namely Decision Tree (DT), Random Forest (RF) and Adaptive Boosting (AdaBoost), are trained on the encoded feature vectors for classification. Besides, the search for good hyperparameters is a challenging task, which is usually done manually and takes a considerable amount of time. To resolve this problem, a Bayesian optimizer is used to simplify the tuning process of DT, RF and AdaBoost. Finally, the results indicate that RF outperforms DT and AdaBoost.
Article
The theft of electricity affects power supply quality and the safety of grid operation, and non-technical losses (NTL) have become the major reason for unfair power supply and economic losses for power companies. For more effective electricity theft inspection, an electricity theft detection method based on a similarity measure and a decision tree combined with K-Nearest Neighbor and support vector machine (DT-KSVM) is proposed in the paper. Firstly, a condensed feature set is devised based on a feature selection strategy, and typical power consumption characteristic curves of users are obtained based on the kernel fuzzy C-means algorithm (KFCM). Next, to solve the problem of the lack of stealing data and realize the reasonable use of advanced metering infrastructure (AMI), one-dimensional Wasserstein generative adversarial networks (1D-WGAN) are used to generate more simulated stealing data. Then the numerical and morphological features in the similarity measurement process are comprehensively considered to conduct preliminary detection of NTL, and DT-KSVM is used to perform secondary detection and identify suspicious customers. At last, simulation experiments verify the effectiveness of the proposed method.
Article
Inspired by the powerful feature extraction and data reconstruction ability of the autoencoder, a stacked sparse denoising autoencoder is developed for electricity theft detection in this paper. The technical route is to employ the electricity data from honest users as the training samples, so that the autoencoder can learn the effective features from the data and then reconstruct the inputs as closely as possible. Since anomalous behavior contributes little to the autoencoder, the detector returns a comparatively higher reconstruction error for it; hence the theft users can be recognized by setting an appropriate error threshold. To improve the feature extraction ability and the robustness, sparsity and noise are introduced into the autoencoder, and the particle swarm optimization algorithm is applied to optimize these hyper-parameters. Moreover, the receiver operating characteristic curve is used to estimate the optimal error threshold. Finally, the proposed approach is evaluated and verified using the electricity dataset in Fujian, China.
Article
Nontechnical losses (NTLs) are estimated to be considerable and increasing every year. Recently, high-resolution measurements from globally laid smart meters have brought deeper insights into users' consumption patterns that can potentially be exploited for NTL detection. However, consumption-pattern-based NTL detection is now facing two major challenges: the inefficiency of harnessing high dimensionality and the severe lack of fraudulent samples. To overcome them, an NTL detection model based on deep learning and anomaly detection is proposed in this article, namely the bidirectional Wasserstein GAN and support vector data description-based NTL detector (BSBND). Motivated by the powerful ability of generative adversarial networks (GANs) to learn deep representations from high-dimensional distributions of data, in the BSBND we utilize a BiWGAN for feature extraction from high-dimensional raw consumption records, and a one-class classifier trained only on benign samples, SVDD, is adopted to map features into judgments. Moreover, a novel alternate coordinating algorithm is proposed to optimize the cooperation between the upstream BiWGAN and the downstream SVDD, and an interpreting algorithm is proposed to visualize the basis of each fraudulent judgment. Case studies have demonstrated the superiority of the BSBND over the state of the art, the powerful feature extraction ability of BiWGAN, and the effectiveness of the proposed coordinating and interpreting algorithms.
Article
Non-technical losses in electricity utilities are responsible for major revenue losses. In this paper, we propose a novel end-to-end solution to self-learn the features for detecting anomalies and frauds in smart meters using a hybrid deep neural network. The network is fed with simple raw data, removing the need for handcrafted feature engineering. The proposed architecture consists of a long short-term memory network and a multi-layer perceptron network. The first network analyses the raw daily energy consumption history, whilst the second one integrates non-sequential data such as the contracted power or geographical information. The results show that the hybrid neural network significantly outperforms state-of-the-art classifiers as well as previous deep learning models used in non-technical losses detection. The model has been trained and tested with real smart meter data of Endesa, the largest electricity utility in Spain.