All content in this area was uploaded by Nadeem Javaid on May 19, 2021
Comparative Study of Data Driven Approaches
towards Efficient Electricity Theft Detection in
Micro Grids
Faisal Shehzad1, Muhammad Asif1, Zeeshan Aslam2,
Shahzaib Anwar3, Hamza Rashid4, Muhammad Ilyas3, Nadeem Javaid1,∗
1Department of Computer Science, COMSATS University Islamabad, Islamabad, Pakistan
2Department of Computer Science, Bahria University Islamabad, Islamabad, Pakistan
3Department of Electrical & Computer Engineering, COMSATS University Islamabad, Islamabad, Pakistan
4University of Management & Technology, Lahore, Pakistan
Email: faisalaslam156@gmail.com, muhammad.asif.comsat@gmail.com, zeeshanxh@gmail.com,
shazaibanwar1@gmail.com, hamzarashid84@gmail.com, milyas3731@gmail.com,
∗Corresponding author: nadeemjavaidqau@gmail.com; www.njavaid.com.
Abstract—In this research article, we tackle the following
limitations of electricity theft detection: high misclassification
rate, low detection rate, the class imbalance problem and the
unavailability of malicious (theft) samples. Class imbalance is a
severe issue in electricity theft detection that degrades the
performance of supervised learning methods. We exploit the adaptive
synthetic minority oversampling technique to tackle this problem.
Moreover, theft samples are created from benign samples, and we argue
that the goal of theft is to report less than the actual electricity
consumption. Different machine learning and deep learning methods,
including the recently developed light and extreme gradient boosting
(XGBoost), are trained and evaluated on a realistic electricity
consumption dataset that is provided by an electric utility in
Pakistan. The consumers in the dataset belong to different
demographics and to different social and financial backgrounds.
Several classifiers are trained on the acquired data; long short-term
memory (LSTM) and XGBoost attain the highest performance and
outperform the other classifiers. XGBoost achieves a 0.981 detection
rate and a 0.015 misclassification rate, whereas LSTM attains a 0.976
detection rate and a 0.033 misclassification rate. Moreover, the
performance of all implemented classifiers is evaluated through
precision, recall, F1-score, etc.
Index Terms—Data driven approaches, ADASYN, ETD, NTL,
Machine learning, Deep learning, FPR, TPR.
I. BACKGROUND STUDY
The objective of electricity theft detection (ETD) is to
detect electricity losses in a smart grid. There are two types
of losses: technical losses and non-technical losses (NTL).
The former occur due to energy dissipation in transmission lines,
transformers and other types of electric equipment. The latter
happen due to illegal activities such as meter tampering, meter
bypassing and the use of shunt devices [1]. In the literature,
different approaches are designed to detect NTL in the smart grid:
game theory, hardware-based and data-driven approaches. In
the following articles, the authors use multiple data-driven
approaches to analyse users’ consumption and extract
abnormal patterns. Jokar et al. [2] suggest a consumption
pattern-based electricity theft detector to detect abnormal
patterns. The Irish electricity dataset, which contains information
about benign¹ consumers only, is used to train the classifier. The
authors propose six theft cases to generate malicious samples
and argue that abnormal electricity consumption is always
less than normal consumption. However, they use an
oversampling technique to handle the class imbalance. It
generates duplicated copies of minority class samples, which
affects the learning of the classifier. Punmiya and Choe [3] present
a comparison between gradient boosting classifiers, namely eXtreme
gradient boosting (XGBoost), light gradient boosting (LightGBM)
and categorical boosting (CatBoost), and support vector
machine (SVM). Irish smart meter data is used that contains
5% to 8% theft samples. New theft samples are generated
through the six existing theft cases described in
[2] to balance the dataset. In [4]–[6], the authors use supervised
machine learning techniques to detect unusual consumption
behaviour from electricity consumption datasets.
In [7], Razavi et al. develop an agnostic model for feature
extraction using a finite mixture model and a genetic evolutionary
algorithm. Kong et al. [8] propose a combined supervised and
unsupervised method to detect illegal consumers. A dynamic time
warping similarity measure is used for preliminary detection, and
the data is then passed to a hybrid SVM and K-nearest neighbour
(KNN) based model to differentiate between normal and abnormal
patterns. To tackle the class imbalance problem, synthesized
electricity theft patterns are generated through a Wasserstein
generative adversarial network. Finally, features are selected and
extracted through clustering evaluation criteria and an autoencoder,
respectively. Buzau et al. [9] use XGBoost, electricity consumption
data and smart meter information to detect NTL in Spain.
K-means clustering and the Euclidean and Manhattan distances are
utilized to generate new features. Recent and old anomalies are
captured through the z-score and local outlier factor, respectively.
The authors train several ML algorithms with the newly generated
features and smart meter data, and compare their results.

¹ Benign and normal samples are used interchangeably.
Buzau et al. [10] propose a hybrid model, which is a
combination of long short-term memory (LSTM) and multilayer
perceptron (MLP). The former is validated and tested on
sequential data, whereas the latter is validated and tested on
non-sequential data. The dataset used in this article is highly
imbalanced, which biases the model towards the majority
class and generates false results. Zheng et al. [11] propose
a wide and deep convolutional neural network to detect
electricity theft. Sequential data (weekly data) is passed to a
convolutional neural network (CNN) to capture local patterns,
while 1D data is fed into an MLP model (wide component)
to retrieve global patterns from electricity consumption data.
However, they do not handle the class imbalance, which
biases the classifier towards the majority class and increases
the misclassification rate. Moreover, the MLP model is designed for
tabular data, not for sequential data. In [12], Huang and Xu
develop a stacked sparse denoising autoencoder (SSDAE) to
capture abnormal patterns from electricity consumption data.
SSDAE extracts optimal features, compares them with the original
ones and reduces the reconstruction loss. The authors introduce
sparsity and noise to enhance the feature extraction ability
of the autoencoder, and its hyperparameters are tuned using
the particle swarm optimization technique. SSDAE achieves
a 91.74% detection rate and a 7.19% false positive rate (FPR),
which is better than state-of-the-art classifiers. Fenza et al. [13]
introduce an idea based on concept drift and a time-aware
system. Their proposed framework contains LSTM and
K-means clustering algorithms. The K-means clustering algorithm
is used to find similar consumption behaviour. LSTM is used
to predict a user’s consumption, which is compared with the actual
consumption to detect anomalies.
Bhat et al. [14] use LSTM, CNN and a stacked autoencoder
to detect abnormal consumption patterns in distributed
systems. The authors use a synthesized dataset that contains
7% abnormal samples. The CNN classifier outperforms the LSTM
and the stacked autoencoder. Hasan et al. [15] propose a CNN-LSTM
based hybrid model. CNN is employed to extract
the optimal features and LSTM is used for the classification
task. The dataset is imbalanced, which is addressed
through the synthetic minority oversampling technique (SMOTE).
However, the authors do not consider the FPR, although a high
FPR is very costly because utilities have a limited budget
for on-site inspection. In [16], despite extensive usage
of machine learning (ML) techniques, the authors do not
focus on the selection of optimal features. In [1], [17], the
authors discuss the possibilities of implementing ML classifiers for
the detection of NTL and describe the advantages of selecting
optimal features and their impact on classifier performance.
One of the main challenges [18] that limits the classification
ability of existing methods is the high dimensionality of
data. Smart meters greatly improve data collection
procedures and provide high-dimensional data from which
complex patterns can be captured. However, research shows that most
existing classification methods are based on conventional ML
classifiers such as artificial neural network (ANN), SVM and
decision tree, which have limited generalization ability and are
unable to learn complex patterns in high-dimensional data.
The contributions of this work are listed below.
• The legitimate data of any consumer is collected from the
consumption history. However, it is very difficult to obtain
malicious samples because theft cases rarely happen
and may not be present in the user’s consumption
history. So, we apply six theft cases to generate malicious
samples and argue that the motive of theft is to report less
consumption than actual usage.
• After generating malicious samples, the adaptive synthetic
minority oversampling (ADASYN) technique is exploited
to tackle the class imbalance problem. This problem
creates a biased model, which leads to a high FPR.
• We conduct an empirical study to compare the performance
of different machine learning and deep learning
models: XGBoost, CatBoost, SVM, KNN, MLP, LSTM
and random forest (RF).
• We train and test all classifiers on PRECON, a realistic
electricity consumption dataset. Moreover,
different measures such as the confusion matrix, precision,
recall, F1-score, etc. are used to evaluate their performance.
II. ACQUIRING DATASET AND HANDLING CLASS
IMBALANCED PROBLEM
PRECON² is the first dataset of its kind that belongs to users
of a developing country. The data collection aims to understand
the electricity consumption behaviour of users belonging to
different demographics and to different social and financial
backgrounds. Data is collected for a period of one year
and contains electricity consumption information of thirty-two
houses. The people who participated in this research
installed smart meters in their houses and agreed to take
part. So, it is a valid assumption that all
participants are honest consumers. The large variety of
consumers, the long measurement period and online availability
make this dataset an excellent source for research on smart
grids. It contains thirty-two files with one electricity measurement
per second. We reduce the data granularity by taking
one sample every half-hour because high granularity
requires high computation power and affects consumer privacy.
Consequently, the dataset contains only honest consumers. Fig. 1a
shows the electricity consumption of a benign consumer.
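The granularity reduction described above can be sketched as follows. This is a minimal illustration that assumes one reading per second stored in a plain list; the function name `downsample` and the toy series are hypothetical and not part of PRECON.

```python
def downsample(readings, step=1800):
    """Keep one reading per `step` samples (1800 s = 30 min for per-second data)."""
    if step <= 0:
        raise ValueError("step must be positive")
    return readings[::step]


# Hypothetical one-day series with one reading per second.
one_day = [float(s % 3600) for s in range(24 * 60 * 60)]
half_hourly = downsample(one_day)
print(len(half_hourly))  # number of half-hourly readings kept for the day
```

With 86,400 readings per day and a step of 1,800, each day is reduced to 48 readings, which matches the 48 time slots used for the theft cases.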
For the analysis of the electricity consumption dataset, we require
theft samples, but theft may be completely absent from users’
consumption history. We solve the lack of theft samples
by generating malicious samples from benign samples, because
the goal of electricity theft is to report less consumption or to
shift load from on-peak hours to off-peak hours. In [2], Jokar et
al. introduce six cases to generate malicious samples from
benign ones. These theft cases are described below, where
$x_t$ is the real consumption of a normal consumer at time slot
$t \in [1, 48]$.
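$t_1(x_t) = \alpha x_t, \quad \alpha = \mathrm{random}(0.1, 0.8)$ (1)
$t_2(x_t) = \beta x_t, \quad \beta = \mathrm{random}[0.1, 0.8]$ (2)
$t_3(x_t) = \gamma_t x_t, \quad \gamma_t = \mathrm{random}(0.1, 0.8)$ (3)
$t_4(x_t) = \beta \, \mathrm{mean}(x), \quad \beta = \mathrm{random}(0.1, 0.8)$ (4)
$t_5(x_t) = \mathrm{mean}(x)$ (5)
$t_6(x_t) = x_{24-1}$ (6)

Fig. 1: (a) Daily consumption of a normal consumer; (b) Consumption behaviour of a theft consumer

² PRECON: Pakistan Residential Electricity Consumption Dataset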
$t_1(\cdot)$ multiplies each meter reading by the same random number
between 0.1 and 0.8 and sends it to the electric utility. $t_2(\cdot)$
sends or does not send measurements at different intervals of time.
$t_3(\cdot)$ multiplies each meter reading by a different random
number and reports lower consumption. $t_4(\cdot)$ reports the mean
consumption scaled by a random factor, while $t_5(\cdot)$ reports the
exact mean value of the measurements. $t_6(\cdot)$ reverses the order
of the consumption readings, which shifts the reported load away from
on-peak hours. $t_5(\cdot)$ and $t_6(\cdot)$ launch attacks against
the load control mechanism and report high consumption in off-peak
hours and low consumption in on-peak hours. We apply all these cases
to benign samples to generate the malicious samples. Fig.
1b shows the consumption of a normal consumer after applying the
six theft cases.
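The six theft cases can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: `t2` follows Eq. (2) as printed (a single random factor) and does not model the on/off reporting mentioned in the prose, `t6` is read as a plain reversal of the daily readings, and the helper `make_theft_samples` and the toy benign profile are hypothetical.

```python
import random
from statistics import mean


def t1(x, rng):
    """Eq. (1): scale the whole day by one random factor."""
    alpha = rng.uniform(0.1, 0.8)
    return [alpha * v for v in x]


def t2(x, rng):
    """Eq. (2) as printed: one random factor (on/off reporting is not modelled)."""
    beta = rng.uniform(0.1, 0.8)
    return [beta * v for v in x]


def t3(x, rng):
    """Eq. (3): scale every reading by its own random factor."""
    return [rng.uniform(0.1, 0.8) * v for v in x]


def t4(x, rng):
    """Eq. (4): report a random fraction of the daily mean in every slot."""
    beta = rng.uniform(0.1, 0.8)
    return [beta * mean(x)] * len(x)


def t5(x, rng):
    """Eq. (5): report the daily mean in every slot."""
    return [mean(x)] * len(x)


def t6(x, rng):
    """Assumed reading of Eq. (6): report the day in reverse order."""
    return list(reversed(x))


THEFT_CASES = {"t1": t1, "t2": t2, "t3": t3, "t4": t4, "t5": t5, "t6": t6}


def make_theft_samples(x, seed=0):
    """Apply all six cases to one benign daily profile."""
    rng = random.Random(seed)
    return {name: case(x, rng) for name, case in THEFT_CASES.items()}


# Hypothetical benign day with 48 half-hourly readings.
benign = [0.5 + 0.1 * (t % 12) for t in range(48)]
theft = make_theft_samples(benign)
print({name: round(sum(profile), 2) for name, profile in theft.items()})
```

Applying the six functions to each of the benign records yields six malicious profiles per record, which is how the theft set becomes six times larger than the benign set.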
We have 321 normal consumption records and each theft
case generates 321 new theft patterns. The total number
of theft patterns is therefore 6 × 321 = 1926, which is far more
than the number of normal records. This situation creates a class
imbalance, which is a critical problem in ETD where one class
(typically honest consumers) dominates the other (theft consumers).
The data is not evenly distributed across classes and is skewed
towards the majority class. When a machine learning model is trained
on an imbalanced dataset, it becomes biased towards the majority
class and does not learn the important features of the minority
class, which increases the FPR. Traditionally, two sampling
techniques, undersampling and oversampling, are used to balance a
dataset. However, these approaches are not adopted here due to
computational overhead, information loss and duplication of existing
data. In this manuscript, we opt for the adaptive synthetic sampling
approach (ADASYN) to address the class imbalance [19], [20].
ADASYN is based on an adaptive learning approach that
focuses on minority class samples that are difficult to learn
rather than on those that are easy to learn. It not only reduces the
learning bias caused by the imbalanced data distribution but also
shifts the decision boundary towards the minority class samples that
are difficult to learn. The working process of ADASYN is described
below.
The training data $D_{tr}$ contains $m$ samples $\{x_i, y_i\}$,
$i = 1, \dots, m$, where $x_i$ belongs to the $n$-dimensional feature
space $X$ and $y_i \in Y = \{0, 1\}$ is the class label of $x_i$.
$m_s$ and $m_l$ denote the number of minority and majority class
samples, respectively.
• Calculate the degree of class imbalance:
$d = \frac{m_s}{m_l}, \quad d \in (0, 1]$ (7)
The following steps are executed if $d < d_s$, where $d_s$ is a
preset threshold for the maximum tolerated degree of class imbalance.
• Calculate the number of synthetic samples that need to be
generated for the minority class:
$G = (m_l - m_s) \times \beta, \quad \beta \in [0, 1]$ (8)
$\beta$ is a parameter that decides the balance level between
minority and majority class samples. If $\beta = 1$, the data
imbalance problem is fully resolved.
• For each $x_i$ in the minority class, find its $K$ nearest
neighbours and calculate the ratio
$r_i = \frac{\delta_i}{K}, \quad i = 1, \dots, m_s$ (9)
where $\delta_i$ is the number of samples among the $K$ nearest
neighbours of $x_i$ that belong to the majority class.
• Normalize $r_i$:
$r_i \leftarrow \frac{r_i}{\sum_{j=1}^{m_s} r_j}$ (10)
so that $r_i$ is a density distribution with $\sum_{i=1}^{m_s} r_i = 1$.
• Calculate the number of synthetic samples that are
generated for each minority class sample $x_i$:
$g_i = r_i \times G$ (11)
where $G$ is the total number of minority samples that need to be
generated.
• For each $x_i$, generate $g_i$ samples using the
following steps.
Do Loop 1 to $g_i$:
Randomly select one minority sample $x_{zi}$ from the $K$ nearest
neighbours of $x_i$.
$s_i = x_i + (x_{zi} - x_i) \times \lambda$ (12)
End Loop
$\lambda \in [0, 1]$ is a random number and $s_i$ is a new minority
class sample, which is added at the end of the dataset.
The idea of ADASYN is based on the density distribution $r_i$,
which automatically gives higher weights to minority
class samples that are difficult to learn. This is the major
difference from SMOTE, an oversampling technique that gives
equal weight to each minority class sample.
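The ADASYN steps above can be sketched in dependency-free Python. This is an illustrative sketch, not the implementation used in the experiments: the brute-force neighbour search, the rounding of $g_i$ to an integer, the uniform fallback when no minority sample has majority neighbours, and the choice to draw $x_{zi}$ from the nearest minority neighbours are all assumptions, and the toy data is hypothetical.

```python
import math
import random


def k_nearest(X, i, candidates, k):
    """Indices of the k candidates closest to X[i] (Euclidean), excluding i itself."""
    others = [j for j in candidates if j != i]
    others.sort(key=lambda j: math.dist(X[i], X[j]))
    return others[:k]


def adasyn(X, y, minority=1, k=5, beta=1.0, seed=0):
    """Return a list of synthetic minority samples following Eqs. (8)-(12)."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    m_s, m_l = len(min_idx), len(y) - len(min_idx)
    G = (m_l - m_s) * beta                        # Eq. (8)

    # Eq. (9): share of majority samples among the k nearest neighbours.
    r = []
    for i in min_idx:
        nn = k_nearest(X, i, range(len(X)), k)
        r.append(sum(1 for j in nn if y[j] != minority) / k)

    # Eq. (10); uniform weights are an assumed fallback when all r_i are zero.
    total = sum(r)
    r = [ri / total for ri in r] if total > 0 else [1.0 / m_s] * m_s

    synthetic = []
    for i, ri in zip(min_idx, r):
        g_i = round(ri * G)                       # Eq. (11), rounded to an integer
        nn_min = k_nearest(X, i, min_idx, k)      # minority neighbours only (assumption)
        for _ in range(g_i if nn_min else 0):
            z = rng.choice(nn_min)
            lam = rng.random()                    # lambda drawn from [0, 1)
            # Eq. (12): interpolate between x_i and the chosen neighbour.
            synthetic.append([a + (b - a) * lam for a, b in zip(X[i], X[z])])
    return synthetic


# Hypothetical toy data: 20 majority points (label 0) and 5 minority points (label 1).
X = [[float(i), 0.0] for i in range(10)] + [[float(i), 1.0] for i in range(10)]
X += [[float(i), 0.5] for i in range(2, 7)]
y = [0] * 20 + [1] * 5
synthetic = adasyn(X, y)
print(len(synthetic), "synthetic minority samples generated")
```

Because each $g_i$ is rounded separately, the number of generated samples can differ slightly from $G$. Minority samples surrounded by more majority neighbours receive a larger share, which is the adaptive behaviour that distinguishes ADASYN from SMOTE.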
III. STATE-OF-THE-ART NTL DETECTION TECHNIQUES
There are different techniques for NTL detection: data-driven,
game theory and hardware-based approaches.
Fig. 2 shows an overview of these approaches. We
focus on supervised learning approaches, which have applications
in different fields such as bioinformatics, object detection,
text classification, and spam and anomaly detection. In this
manuscript, we use supervised learning approaches to detect
anomalies in electricity consumption patterns and compare
their results to evaluate their performance. These
approaches are described below.
Fig. 2: Different techniques for electricity theft detection. The
diagram divides techniques for NTL detection into hardware-based,
data-driven and game theory techniques, and subdivides data-driven
techniques into supervised, unsupervised and semi-supervised
learning. Methods listed in the diagram: MLP, SVM, decision tree,
gradient boosting classifiers, LR, RF, CNN, LSTM; anomaly detection,
heuristic approaches, generative models, graph-based methods;
association rule mining, clustering methods, regression models.
Gradient boosting classifiers: Boosting is an ensemble
technique in which several weak learners are combined to form
a strong learner. There are many ensemble techniques, such as
random forest (which relies on bagging), adaptive boosting and
gradient boosting, and they use different mechanisms to reduce the
loss function. Gradient boosting algorithms use gradient descent to
reduce the loss function. XGBoost is based on sequential learning:
weak learners are added one after another, while the construction of
each learner is parallelized, which increases training speed. It is
designed to utilize the maximum hardware resources: cache and hard
drive are used efficiently to handle small and large datasets.
Besides, it uses a weighted quantile sketch algorithm to find
candidate split points. CatBoost is a recent member of the gradient
boosting family. It is developed by the machine learning team
at Yandex. It can handle categorical features and
uses an ordered boosting strategy to avoid information leakage.
CatBoost uses oblivious decision trees, which apply the same
splitting criterion across each level and are less prone to
overfitting.
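The sequential idea behind gradient boosting can be illustrated with a small sketch. It is not XGBoost or CatBoost: it assumes a single numeric feature, squared-error loss and decision stumps as weak learners, and the function names and toy data are hypothetical.

```python
def fit_stump(xs, residuals):
    """Best single-threshold split on a 1-D feature (needs at least two distinct xs)."""
    best = None
    for thr in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, thr, lm, rm)
    _, thr, lm, rm = best
    return lambda x: lm if x <= thr else rm


def gradient_boost(xs, ys, n_rounds=50, lr=0.1):
    """Fit stumps one after another to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared-error loss the negative gradient is the residual y - prediction.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + lr * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + lr * sum(s(x) for s in stumps)


# Hypothetical toy data: label 0 below 5, label 1 from 5 upwards.
xs = [float(i) for i in range(10)]
ys = [0.0] * 5 + [1.0] * 5
model = gradient_boost(xs, ys)
print(round(model(2.0), 3), round(model(8.0), 3))
```

Each round fits a weak learner to what the current ensemble still gets wrong, and the learning rate `lr` controls how much of that correction is added. Production libraries add regularization, approximate split finding and parallel tree construction on top of this loop.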
Support vector machine: SVM is a well-known classifier
in ETD. It is an extension of the maximal margin classifier.
In SVM, training data is used to fit the classifier and
predictions are made on testing data. It uses kernel functions
to transform data into a higher-dimensional space, where a hyperplane
can be drawn to separate the classes. Polynomial and radial basis
function kernels are used to handle non-linear data, whereas a linear
kernel is used to draw a decision boundary between classes of
linearly separable data. In [2], the authors use SVM to capture
abnormal patterns from electricity consumption data.
Random forest: Random forest (RF) is an ensemble learning
technique in which multiple decision trees are combined to
give a final decision through majority voting. It is
widely used for classification and regression tasks. Moreover,
it reduces overfitting by aggregating multiple decision
trees. However, the main limitation of random
forest is that it contains a large number of decision trees, which
makes it computationally less efficient for real-world problems.
k-nearest neighbours algorithm: KNN is a simple and
easy-to-implement supervised machine learning algorithm. It
is used for both classification and regression tasks; however,
it is mostly used for classification in industry. KNN
relies on a similarity measure to handle classification and
regression problems: a sample is assigned according to the labels of
its most similar training samples. It is a lazy and non-parametric
learning algorithm. Its computation time, memory usage
and accuracy depend upon the nature of the data. In [8], the authors
use KNN with SVM to reduce the misclassification of data
points that are near the decision boundary.
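The similarity-based decision rule of KNN can be sketched as follows. This is a minimal illustration that assumes Euclidean distance and majority voting; the function name `knn_predict` and the toy consumption profiles are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Label a query by majority vote among its k nearest training samples."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-slot consumption profiles: normal users report more than theft users.
train_X = [[1.0, 1.1], [0.9, 1.0], [1.2, 0.9], [0.2, 0.3], [0.3, 0.2], [0.1, 0.2]]
train_y = ["normal", "normal", "normal", "theft", "theft", "theft"]
print(knn_predict(train_X, train_y, [0.25, 0.25]))
print(knn_predict(train_X, train_y, [1.0, 1.0]))
```

No model is fitted in advance: all distance computations happen at prediction time, which is why KNN is called a lazy learner and why its cost grows with the size of the training set.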
Multilayer perceptron: The artificial neural network, or
multilayer perceptron, is one of the first biologically inspired
machine learning algorithms. It contains input, hidden and output
layers. It is successfully used for NTL detection in [1] and [2]. It
extracts latent information from a consumer’s consumption
to differentiate between normal and theft samples.
TABLE I: Confusion matrix of all classifiers
Type of classifiers   True positive   True negative   False positive   False negative
XGBoost               58              379             6                7
CatBoost              54              367             10               19
RF                    57              289             7                97
SVM-linear            50              331             14               55
KNN                   63              340             1                40
MLP                   54              338             10               48
LSTM                  51              377             13               9

TABLE II: Precision, recall, F1-score and FPR of all classifiers
Type of classifiers   Precision   Recall   F1-measure   FPR
XGBoost               0.983       0.984    0.981        0.015
CatBoost              0.973       0.950    0.961        0.026
RF                    0.976       0.748    0.847        0.023
SVM-linear            0.959       0.857    0.905        0.04
KNN                   0.880       0.997    0.935        0.002
MLP                   0.971       0.875    0.920        0.028
LSTM                  0.966       0.976    0.971        0.033

Long short-term memory: LSTM networks are variants
of recurrent neural networks (RNN). A smart meter provides the
historical data of a customer, which can be a year long
and grows daily. RNNs are unable to detect
abnormal patterns in long sequences due to the vanishing
and exploding gradient problems. LSTM is an enhanced
version of the RNN that solves these problems.
We use LSTM to capture long patterns from the user’s
historical data. The LSTM structure is similar to that of the RNN
but differs in its internal components. The important part of
LSTM is its cell state, which acts as a transport highway that
carries relevant information down the sequence chain. It
has three gates to regulate the flow of information throughout
the network. The forget gate decides which information to keep in
or discard from the cell state. The input gate uses the sigmoid and
tanh functions to update the cell state. The output gate produces
the final output and also decides the hidden state. All equations
of the LSTM network are given below [15].
$f_t = \sigma(w_f [h_{t-1}, x_t] + b_f)$ (13)
$i_t = \sigma(w_i [h_{t-1}, x_t] + b_i) \odot \tanh(w_i [h_{t-1}, x_t] + b_i)$ (14)
$C_t = f_t \odot C_{t-1} + i_t$ (15)
$o_t = \sigma(w_o [h_{t-1}, x_t] + b_o)$ (16)
$h_t = o_t \odot \tanh(C_t)$ (17)
$f_t$, $i_t$ and $o_t$ represent the forget gate, input gate and
output gate, respectively. $x_t$, $w_f$, $w_i$ and $w_o$ denote the
current input, forget gate weight, input gate weight and output gate
weight. $C_t$ and $h_t$ represent the current cell state and hidden
state, respectively. $C_{t-1}$ and $h_{t-1}$ denote the previous
cell state and hidden state. $b_f$, $b_i$ and $b_o$ represent the
forget gate bias, input gate bias and output gate bias. $\odot$
denotes point-wise multiplication and $\sigma$ represents the
sigmoid function. The tanh function squashes values between
-1 and +1.
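A single LSTM time step can be traced with scalar states, following Eqs. (13)-(17) as printed, in which the input term reuses the input gate weight and bias for both the sigmoid and the tanh factor. Standard LSTM formulations usually give the tanh candidate its own weights; this sketch keeps the printed form. The parameter dictionary, its values and the short reading sequence are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step with scalar states; p maps 'w_*' to (w_h, w_x) and 'b_*' to a bias."""

    def affine(gate):
        w_h, w_x = p["w_" + gate]
        return w_h * h_prev + w_x * x_t + p["b_" + gate]

    f_t = sigmoid(affine("f"))                            # Eq. (13): forget gate
    i_t = sigmoid(affine("i")) * math.tanh(affine("i"))   # Eq. (14): input term
    c_t = f_t * c_prev + i_t                              # Eq. (15): new cell state
    o_t = sigmoid(affine("o"))                            # Eq. (16): output gate
    h_t = o_t * math.tanh(c_t)                            # Eq. (17): new hidden state
    return h_t, c_t


# Hypothetical parameters and a short sequence of half-hourly readings.
params = {
    "w_f": (0.5, 0.1), "b_f": 0.0,
    "w_i": (0.3, 0.8), "b_i": 0.1,
    "w_o": (0.2, 0.4), "b_o": 0.0,
}
h, c = 0.0, 0.0
for reading in [0.2, 0.5, 0.1, 0.9]:
    h, c = lstm_step(reading, h, c, params)
print(round(h, 4), round(c, 4))
```

The cell state `c` is only rescaled by the forget gate and shifted by the input term at each step, which is what lets information persist over long sequences.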
IV. EXPERIMENTAL RESULTS
In this section, we evaluate the performance of all classifiers
by performing extensive experiments. All of them are
implemented on Google Colab, a cloud-hosted notebook platform
that is widely used for machine learning analysis because it
provides access to remote computing resources. SVM-linear, KNN and RF
are implemented through the scikit-learn library, while LSTM
and MLP are trained and evaluated through the TensorFlow
library. XGBoost and CatBoost are also open-source libraries
that are available on GitHub. One contribution is that
we compare the performance of different classifiers and give a
detailed analysis by utilizing different performance measures:
precision, recall, F1-score and FPR. Another contribution is
that we compare the efficiency of the recently developed
state-of-the-art classifiers XGBoost and CatBoost with conventional
machine learning and deep learning models. Table I shows the
confusion matrix of all implemented classifiers.
Precision, recall, F1-score and FPR are presented in Table
II. We first consider the F1-score as a performance
measure. Among the ensemble classifiers, XGBoost
has a higher F1-measure than CatBoost and RF. In contrast,
RF also belongs to the group of ensemble
learners but has the lowest performance. KNN and SVM-linear
obtain F1-measure values of 0.935 and 0.905, respectively.
KNN gives good results, which suggests that the normal and theft
classes are easily separable. LSTM and MLP belong
to the deep learning class. LSTM attains a higher F1-measure
than MLP (Table II) because it has memory
cells to remember the consumption patterns of theft and normal
consumers. XGBoost gives the highest performance of
all classifiers because it performs sequential learning
and reduces the misclassification rate by utilizing a gradient
descent algorithm.
Fig. 3: ROC curve of all classifiers. (a) XGBoost, CatBoost, SVM-linear; (b) MLP, LSTM, RF, KNN
Taking precision as the metric, ensemble classifiers
achieve higher precision than conventional and
deep learning models. XGBoost attains a precision of 0.983,
which is higher than that of all other implemented classifiers. MLP
attains 0.971, which is better than the precision of LSTM. Next,
we take recall as the performance measure. It is the fraction of
relevant samples that are returned by a classifier. KNN obtains
the highest recall, while RF achieves the lowest value (Table II).
FPR is also referred to as the misclassification rate in this work.
It is very important for electric utilities because they have
limited resources for on-site inspection. KNN gives the lowest FPR
of all implemented classifiers. However, one drawback of
KNN is that it belongs to the group of lazy learning classifiers.
These classifiers give good results on small datasets, while
their performance decreases drastically on larger datasets.
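The measures discussed above follow directly from confusion-matrix counts. The sketch below uses the standard textbook definitions and assumes non-zero denominators; the function name and the counts are hypothetical and are not taken from Table I.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard precision, recall, F1-score and FPR from confusion-matrix counts."""
    precision = tp / (tp + fp)   # share of flagged samples that are truly positive
    recall = tp / (tp + fn)      # share of positive samples that are detected
    f1 = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)         # share of negative samples that are wrongly flagged
    return {"precision": precision, "recall": recall, "f1": f1, "fpr": fpr}


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, tn=80, fp=20, fn=10)
print({name: round(value, 3) for name, value in metrics.items()})
```

A low FPR matters most to a utility because every false alarm triggers an on-site inspection of an honest consumer.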
Fig. 4: PR curve of all classifiers. (a) XGBoost, CatBoost, SVM-linear; (b) MLP, LSTM, RF, KNN

The receiver operating characteristic (ROC) curve is a
tool that is commonly used to assess the performance of
machine learning models. It plots the true positive rate against
the false positive rate at different threshold values between 0 and
1. Fig. 3 shows the ROC curves of all models to better evaluate
their performance. XGBoost outperforms all classifiers and
achieves the highest ROC curve, while RF gives the lowest
performance. The remaining classifiers CatBoost, SVM-linear,
MLP, LSTM and KNN attain 0.971, 0.884, 0.914, and 0.965
AUC values of the ROC curve, respectively. The precision-recall curve
(PR curve) is another measure that is used to assess
classifier performance on imbalanced datasets. It plots
precision against recall at different thresholds between 0 and 1.
The AUC of the PR curve summarizes the skill of a classifier in a
single value.
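The ROC curve and its AUC can be computed by sweeping a threshold over the classifier scores. This is a dependency-free sketch using the trapezoidal rule; the function names, scores and labels are hypothetical, and a PR curve could be obtained from the same sweep by recording precision and recall instead.

```python
def roc_points(scores, labels):
    """(FPR, TPR) pairs obtained by lowering the threshold over all distinct scores."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= thr and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= thr and l == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical classifier scores and true labels (1 = theft, 0 = normal).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
roc_auc = auc(roc_points(scores, labels))
print(roc_auc)
```

The AUC equals the probability that a randomly chosen theft sample receives a higher score than a randomly chosen normal sample, so a value of 0.5 corresponds to random guessing and 1.0 to perfect ranking.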
The PR curves of the implemented classifiers are presented in
Fig. 4 to check their performance on the imbalanced dataset.
XGBoost outperforms all classifiers and achieves a 0.997 AUC
value, while RF achieves the lowest AUC value. CatBoost,
SVM, MLP, LSTM and KNN get 0.995, 0.975, 0.995 and
0.994 AUC, respectively. In this research article, we empirically
analyze the performance of conventional machine learning
and deep learning classifiers for NTL detection.
XGBoost outperforms the other classifiers and achieves the highest
results. Its tree-based learners perform implicit feature selection,
which reduces the influence of noisy features and improves
performance. Moreover, it performs sequential learning, where weak
learners are trained one after another and are combined in the end
to make a strong learner. For all of these reasons, XGBoost
performs better than all other tested classifiers.
V. CONCLUSION
In this manuscript, we exploit different supervised learning
methods, namely XGBoost, CatBoost, SVM, RF, KNN, MLP and
LSTM, to detect anomalies in users’ consumption history. For a
consumer in a smart grid, benign samples can easily be obtained
from the consumption history. However, theft cases may not be
present in the consumption history. We use six theft cases to
generate malicious samples and argue that the purpose of theft is to
report less consumption than the actual electricity consumption.
ADASYN is utilized to balance the ratio between benign
and theft samples. A realistic electricity consumption dataset,
provided by an electric utility in Pakistan, is utilized to train
and evaluate all implemented classifiers. XGBoost outperforms
the other classifiers and achieves 0.986 and 0.997 ROC-AUC and
PR-AUC, respectively. Its tree-based learners perform implicit
feature selection, which reduces the influence of noisy features and
increases its performance. Moreover, precision, recall, F1-score and
FPR are utilized to evaluate the performance of the classifiers.
REFERENCES
[1] Avila, Nelson Fabian, Gerardo Figueroa, and Chia-Chi Chu. ”NTL
detection in electric distribution systems using the maximal overlap
discrete wavelet-packet transform and random undersampling boosting.”
IEEE Transactions on Power Systems 33, no. 6 (2018): 7171-7180.
[2] Jokar, Paria, Nasim Arianpoo, and Victor CM Leung. ”Electricity
theft detection in AMI using customers’ consumption patterns.” IEEE
Transactions on Smart Grid 7, no. 1 (2015): 216-226.
[3] Punmiya, Rajiv, and Sangho Choe. ”Energy theft detection using gra-
dient boosting theft detector with feature engineering-based preprocess-
ing.” IEEE Transactions on Smart Grid 10, no. 2 (2019): 2326-2329.
[4] Khan, Zahoor Ali, Muhammad Adil, Nadeem Javaid, Malik Najmus
Saqib, Muhammad Shafiq, and Jin-Ghoo Choi. ”Electricity theft de-
tection using supervised learning techniques on smart meter data.”
Sustainability 12, no. 19 (2020): 8023.
[5] Arif, Arooj, Nadeem Javaid, Abdulaziz Aldegheishem, and Nabil Alra-
jeh. ”Big Data Analytics for Identifying Electricity Theft using Machine
Learning Approaches in Micro Grids for Smart Communities.”, Concur-
rency and Computation: Practice and Experience (2021): 1532-0634.
[6] Ghori, Khawaja Moyeezullah, Rabeeh Ayaz Abbasi, Muhammad Awais,
Muhammad Imran, Ata Ullah, and Laszlo Szathmary. ”Performance
analysis of different types of machine learning classifiers for non-
technical loss detection.” IEEE Access 8 (2019): 16033-16048.
[7] Razavi, Rouzbeh, Amin Gharipour, Martin Fleury, and Ikpe Justice
Akpan. ”A practical feature-engineering framework for electricity theft
detection in smart grids.” Applied energy 238 (2019): 481-494.
[8] Kong, Xiangyu, Xin Zhao, Chao Liu, Qiushuo Li, DeLong Dong, and Ye
Li. ”Electricity theft detection in low-voltage stations based on similarity
measure and DT-KSVM.” International Journal of Electrical Power &
Energy Systems 125 (2021): 106544.
[9] Buzau, Madalina Mihaela, Javier Tejedor-Aguilera, Pedro Cruz-Romero,
and Antonio Gómez-Expósito. ”Detection of non-technical losses using
smart meter data and supervised learning.” IEEE Transactions on Smart
Grid 10, no. 3 (2018): 2661-2670.
[10] Buzau, Madalina-Mihaela, Javier Tejedor-Aguilera, Pedro Cruz-Romero,
and Antonio Gómez-Expósito. ”Hybrid deep neural networks for detection
of non-technical losses in electricity smart meters.” IEEE Transactions
on Power Systems 35, no. 2 (2019): 1254-1263.
[11] Zheng, Zibin, Yatao Yang, Xiangdong Niu, Hong-Ning Dai, and Yuren
Zhou. ”Wide and deep convolutional neural networks for electricity-
theft detection to secure smart grids.” IEEE Transactions on Industrial
Informatics 14, no. 4 (2017): 1606-1615.
[12] Huang, Yifan, and Qifeng Xu. ”Electricity theft detection based on
stacked sparse denoising autoencoder.” International Journal of Electrical
Power & Energy Systems 125 (2021): 106448.
[13] Fenza, Giuseppe, Mariacristina Gallo, and Vincenzo Loia. ”Drift-aware
methodology for anomaly detection in smart grid.” IEEE Access 7
(2019): 9645-9657.
[14] Bhat, Rajendra Rana, Rodrigo Daniel Trevizan, Rahul Sengupta, Xi-
aolin Li, and Arturo Bretas. ”Identifying nontechnical power loss via
spatial and temporal deep learning.” In 2016 15th IEEE International
Conference on Machine Learning and Applications (2016) 272-279.
[15] Hasan, Md, Rafia Nishat Toma, Abdullah-Al Nahid, M. M. Islam, and
Jong-Myon Kim. ”Electricity theft detection in smart grid systems: A
CNN-LSTM based approach.” Energies 12, no. 17 (2019): 3310.
[16] Ramos, Caio CO, Douglas Rodrigues, André N. de Souza, and João P.
Papa. ”On the study of commercial losses in Brazil: a binary black hole
algorithm for theft characterization.” IEEE Transactions on Smart Grid
9, no. 2 (2016): 676-683.
[17] Coma-Puig, Bernat, and Josep Carmona. ”Bridging the gap between en-
ergy consumption and distribution through non-technical loss detection.”
Energies 12, no. 9 (2019): 1748.
[18] Hu, Tianyu, Qinglai Guo, Hongbin Sun, Tian-En Huang, and Jian Lan.
”Nontechnical losses detection through coordinated biwgan and svdd.”
IEEE Transactions on Neural Networks and Learning Systems (2020):
2162-237.
[19] He, Haibo, Yang Bai, Edwardo A. Garcia, and Shutao Li. ”ADASYN:
Adaptive synthetic sampling approach for imbalanced learning.” In 2008
IEEE international joint conference on neural networks (2008): 1322-1328.
[20] Javaid, Nadeem, Naeem Jan, and Muhammad Umar Javed. ”An adaptive
synthesis to handle imbalanced big data with deep siamese network
for electricity theft detection in smart grids.” Journal of Parallel and
Distributed Computing (2021): 0743-7315.