Electricity is used across around 80% of the world, which underscores the importance
of its secure and efficient use. Nontechnical losses (NTLs) have become one of
the biggest issues for electric utilities around the world. Intentional tampering
with electric meters and false data injection account for the largest proportion
of NTLs and have hazardous effects on power systems. The digitalization of
traditional grids enables the efficient exchange of information
over short periods, which allows electric utilities to devise innovative data-driven
solutions for electricity theft detection (ETD). The existing data-driven
approaches for ETD have limited ability to handle high-dimensional, noisy and
imbalanced data. Moreover, these approaches have limited ability to capture
associations among features during feature extraction. These limitations raise the
misclassification rate, which makes the existing ETD approaches unacceptable for
electric utilities. Therefore, in this thesis, we propose a new data-driven methodology,
which consists of four new solutions to systematically detect electricity
fraudsters in the smart grid environment. Specifically, in the first system model,
we present a new class balancing mechanism based on the interquartile minority
oversampling technique to handle the data imbalance issues. Then, a combined
ETD model composed of long short-term memory (LSTM), UNet and adaptive
boosting (Adaboost), termed LSTM-UNet-Adaboost, is presented to detect
electricity fraud. Next, in the second solution, we introduce a new mechanism
that is based on two scenarios. In the first scenario, a new supervised-learning-based
mechanism is presented, which combines UNet and a generative adversarial
network (GAN) and is named UNet-GAN. The GAN's structure mainly
comprises two neural networks: a generator and a discriminator. Due to the excellent
performance of UNet, we utilize it in both the generator and the discriminator.
These two neural networks compete with each other in a game-theoretic manner
to significantly boost ETD performance. In the second scenario, we propose
a new dynamic learning-based semi-supervised solution, which consists of a probabilistic
guider (PG) and a Ladder network. This solution is termed the PG-Ladder
network. The PG dynamically guides the proposed PG-Ladder network to further
improve its ETD performance. Furthermore, conventional approaches
require extensive expert involvement to detect theft effectively and lose data
relationships during feature extraction. Therefore, in the third system model,
we address these issues by presenting a new solution based on a relational denoising
autoencoder (RDAE) with an attention-guided (AG) TripleGAN, named
RDAE-AG-TripleGAN. The limitations of conventional clustering mechanisms
and the scarcity of labeled electricity consumption (EC) data are addressed by
a new two-fold, end-to-end semi-supervised solution, referred to as the fourth solution.
In the first fold, it groups similar EC cases by employing a grey wolf optimization
(GWO) based clustering mechanism built on clustering by fast search
and find of density peaks (CFSFDP), known as GC. In the second fold, we design
a new relational stacked denoising autoencoder (RSDAE) enabled semi-supervised
GAN, termed RGAN, for ETD. The combined solution is therefore named
GC-RGAN. In this system, the RSDAE serves as both the feature extractor and the generator
model of the proposed RGAN. The proposed semi-supervised solutions efficiently
exploit the benefits of both labeled and unlabeled data. Furthermore,
the proposed solutions are simulated and evaluated on the real-time smart
meter dataset of the State Grid Corporation of China using suitable performance
indicators, e.g., area under the curve and Matthews correlation coefficient.
The simulation outcomes confirm that the proposed methodology outperforms
traditional methods, such as the semi-supervised support vector machine and random
forest, for ETD and is suitable for real-time practice.
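
For readers unfamiliar with the two performance indicators named above, the
following is a minimal sketch of how they can be computed for a binary ETD
classifier. It is an illustration only, not the evaluation code used in this
thesis: the function names, the label convention (1 = theft, 0 = honest), the
0.5 decision threshold and the toy scores are all assumptions made for the example.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels.

    Assumed convention (hypothetical): 1 = theft, 0 = honest consumer.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By common convention, MCC is reported as 0 when the denominator vanishes.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as half).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, imbalanced example (made-up values): 2 theft cases among 8 consumers.
y_true = [1, 0, 0, 0, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.4, 0.1, 0.7, 0.3, 0.6, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(round(matthews_corrcoef(y_true, y_pred), 4))
print(round(roc_auc(y_true, scores), 4))
```

Both indicators remain informative under class imbalance, where plain accuracy
can look high even if most theft cases are missed.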