Data science is an emerging field, which has applications in multiple disciplines; like healthcare, advanced image recognition, airline route planning, augmented reality, targeted advertising, etc. In this thesis, we have exploited its applications in smart grids and financial markets with three major contributions. In the first two contributions, machine learning (ML) and deep learning (DL)
... [Show full abstract] models are utilized to detect anomalies in electricity consumption (EC) data, while in third contribution, upwards and downwards trends in the financial markets are predicted to give benefits to the potential investors.
Non-technical losses (NTLs) are one of the major causes of revenue losses for electric utilities. In the literature, various ML and DL approaches are employed to detect NTLs. The first solution introduces a hybrid DL model, which tackles the class imbalance problem and curse of dimensionality and low detection rate of existing models. The proposed model integrates benefits of both GoogLeNet and gated recurrent unit (GRU). The one dimensional EC data is fed into GRU to remember periodic patterns. Whereas, GoogLeNet model is leveraged to extract latent features from the two dimensional weekly stacked EC data. Furthermore, the time least square generative adversarial network (TLSGAN) is proposed to solve the class imbalance problem. The TLSGAN uses unsupervised and supervised loss functions to generate fake theft samples, which have high resemblance with real world theft samples. The standard generative adversarial network only updates the weights of those points that are available at the wrong side of the decision boundary. Whereas, TLSGAN even modifies the weights of those points that are available at the correct side of decision boundary, which prevent the model from vanishing gradient problem. Moreover, dropout and batch normalization layers are utilized to enhance model’s convergence speed and generalization ability. The proposed model is compared with different state-of-the-art classifiers including multilayer perceptron (MLP), support vector machine, naive bayes, logistic regression, MLP-long short term memory network
and wide and deep convolutional neural network. The second solution presents a framework, which is employed to solve the curse of dimensionality issue. In literature, the existing studies are mostly concerned with tuning the hyperparameters of ML/ DL methods for efficient detection of NTL, i.e., electricity theft detection. Some of them focus on the selection of prominent features from data to improve the performance of electricity theft detection. However, the curse of dimensionality affects the generalization
ability of ML/ DL classifiers and leads to computational, storage and overfitting problems.
Therefore, to deal with above-mentioned issues, this study proposes a system based on metaheuristic techniques (artificial bee colony and genetic algorithm) and denoising autoencoder for electricity theft detecton using big data in electric power systems. The former (metaheuristics) are used to select prominent features. While the latter are utilized to extract high variance features from electricity consumption data. First, new features are synthesized from statistical and electrical parameters from the user’s consumption history. Then, the synthesized features are used as input to metaheuristic techniques to find a subset of optimal features. Finally, the optimal features are fed as input to the denoising autoencoder to extract features with high variance. The ability of both techniques to select and extract features is measured using a support vector machine. The proposed system reduces the overfitting, storage and computational overhead of ML classifiers. Moreover, we perform several experiments to verify the effectiveness of our proposed system and results reveal that the proposed system has higher performance
our counterparts. The third solution introduces a hybrid DL model for prediction of upwards and downwards trends in financial market data. The financial market exhibits complex and volatile behavior that is difficult to predict using conventional ML and statistical methods, as well as shallow neural networks. Its behavior depends on many factors such as political upheavals, investor sentiment, interest rates, government policies, natural disasters, etc. However, it is possible to predict upward and downward trends in financial market behavior using complex DL models. This paper therefore addresses the following limitations that adversely affect the performance of existing ML and DL models, i.e., the curse of dimensionality, the low accuracy of the standalone
models, and the inability to learn complex patterns from high-frequency time series data. The denoising autoencoder is used to reduce the high dimensionality of the data, overcoming the problem of overfitting and reducing the training time of the ML and DL models. Moreover, a hybrid DL model HRG is proposed based on a ResNet module and gated recurrent units. The former is used to extract latent or abstract patterns that are not visible to the human eye, while the latter retrieves temporal patterns from the financial market dataset. Thus, HRG integrates the advantages of both models. It is evaluated on real-world financial market datasets obtained from IBM, APPL, BA and WMT . Also, various performance indicators such as f1-score, accuracy, precision, recall, receiver operating characteristic-area under the curve (ROC-AUC) are used to check the performance of the proposed and benchmark models. The RG 2 achieves 0.95, 0.90, 0.82 and 0.80 ROC-AUC values on APPL, IBM, BA and WMT datasets respectively, which are higher than the ROC-AUC values of all implemented ML and DL models.