Figure - available from: Multimedia Tools and Applications
Source publication
Generative adversarial networks (GANs), a novel framework for training generative models in an adversarial setup, have attracted significant attention in recent years. The two opposing neural networks of the GAN framework, i.e., a generator and a discriminator, are trained simultaneously in a zero-sum game, where the generator generates images to...
Citations
... GANs consist of two neural networks, a generator and a discriminator, trained in an adversarial setup against each other [104]. GANs have been applied to generate images, videos, and even audio that appear real. ...
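The zero-sum objective behind this setup can be sketched in a few lines. The probabilities below are hypothetical discriminator outputs, and the non-saturating generator loss is one common choice rather than the only formulation:

```python
import numpy as np

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy the discriminator minimizes:
    push D(x) -> 1 on real samples and D(G(z)) -> 0 on fakes."""
    return -np.mean(np.log(d_real) + np.log(1.0 - d_fake))

def generator_loss(d_fake):
    """Non-saturating generator loss: push D(G(z)) -> 1."""
    return -np.mean(np.log(d_fake))

# Toy probabilities a discriminator might assign (hypothetical values)
d_real = np.array([0.9, 0.8])   # real images scored close to 1
d_fake = np.array([0.2, 0.1])   # generated images scored close to 0
print(discriminator_loss(d_real, d_fake))  # low loss: D currently separates the two
print(generator_loss(d_fake))              # high loss: G must improve to fool D
```

In training, these two losses are minimized in alternation, which is what makes the game adversarial: an update that helps one network hurts the other.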
Machine learning (ML) and deep learning (DL), subsets of artificial intelligence (AI), are core technologies driving significant transformation and innovation across industries through AI-driven solutions. Understanding ML and DL is essential to analyse their applicability and identify their effectiveness in areas such as healthcare, finance, agriculture, manufacturing, and transportation. ML comprises supervised, unsupervised, semi-supervised, and reinforcement learning techniques. DL, a subfield of ML built on neural networks (NNs), can handle complicated datasets in the health, autonomous systems, and finance industries. This study presents a holistic view of ML and DL technologies, analysing algorithms and their capacity to address real-world problems. It investigates the real-world application areas in which ML and DL techniques are implemented. Moreover, the study highlights the latest trends and possible future avenues for research and development (R&D), including hybrid models, generative AI, and the integration of ML and DL with emerging technologies. The study aims to provide a comprehensive view of ML and DL technologies that can serve as a reference guide for researchers, industry professionals, practitioners, and policy makers.
... These networks are trained adversarially, leading to a model that learns the underlying data distribution and can reconstruct highly accurate outputs, even in high-noise environments. Through this adversarial training framework, the model can perform denoising in a nuanced and adaptive manner, an achievement that pure error-based strategies often struggle to attain [30]. Fig. 3 illustrates the complete GAN framework and the model structure of the discriminator. ...
The growing complexity of cyber threats requires innovative machine learning techniques, and image-based malware classification opens up new possibilities. Meanwhile, existing research has largely overlooked the impact of noise and obfuscation techniques commonly employed by malware authors to evade detection, and there is a critical gap in using noise simulation to replicate real-world malware obfuscation and in adopting a denoising framework to counteract these challenges. This study introduces an image denoising technique based on a U-Net combined with a GAN framework to address noise interference and obfuscation challenges in image-based malware analysis. The proposed methodology addresses existing classification limitations by introducing noise addition, which simulates obfuscated malware, and denoising strategies to restore robust image representations. To evaluate the approach, we used multiple CNN-based classifiers to assess noise resistance across architectures and datasets, and measured significant performance variation. Our denoising technique demonstrates remarkable performance improvements across two multi-class public datasets, MALIMG and BIG-15. For example, the MALIMG classification accuracy improved from 23.73% to 88.84% with denoising applied after Gaussian noise injection, demonstrating robustness. This approach contributes to improving malware detection by offering a robust framework for noise-resilient classification in noisy conditions.
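The noise-injection side of such a pipeline can be illustrated with a minimal sketch. The Gaussian parameters and the synthetic gradient "image" below are illustrative assumptions, not the study's actual settings:

```python
import numpy as np

def add_gaussian_noise(img, sigma=0.1, seed=0):
    """Inject zero-mean Gaussian noise (simulating obfuscation) into a
    grayscale image scaled to [0, 1], then clip back to the valid range."""
    rng = np.random.default_rng(seed)
    noisy = img + rng.normal(0.0, sigma, size=img.shape)
    return np.clip(noisy, 0.0, 1.0)

def psnr(clean, noisy):
    """Peak signal-to-noise ratio in dB for images in [0, 1]."""
    mse = np.mean((clean - noisy) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(1.0 / mse)

clean = np.tile(np.linspace(0, 1, 64), (64, 1))  # toy stand-in for a malware image
noisy = add_gaussian_noise(clean, sigma=0.2)
print(round(psnr(clean, noisy), 1))  # noticeably lower than the clean reference
```

A denoiser such as the U-Net/GAN combination described above would then be trained on (noisy, clean) pairs to raise the PSNR back up before classification.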
... However, as GANs evolve, the sheer number of variants and models makes it difficult for researchers and practitioners to choose the optimal architecture for a specific task. Several GAN architectures have been proposed, including the conditional GAN (CGAN), the Deep Convolutional GAN (DCGAN), and the Wasserstein GAN (WGAN) [1,2], to counter some of the weaknesses inherent in the basic GAN architecture, such as training instability, mode collapse, and failure to generate high-resolution images [3,4]. Each model improves on one of the cited weaknesses but introduces others. ...
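The Wasserstein variant differs from the basic GAN mainly in its loss: the critic outputs unbounded scores rather than probabilities, with no sigmoid or logarithm. A minimal sketch, where the score values are hypothetical:

```python
import numpy as np

def wgan_critic_loss(c_real, c_fake):
    """Wasserstein critic loss: the critic maximizes
    E[C(x)] - E[C(G(z))], i.e. minimizes this negation."""
    return np.mean(c_fake) - np.mean(c_real)

def wgan_generator_loss(c_fake):
    """The generator tries to raise the critic's score on fakes."""
    return -np.mean(c_fake)

# Hypothetical raw critic scores (unbounded, not probabilities)
c_real = np.array([2.5, 3.0, 2.8])
c_fake = np.array([-1.0, -0.5, 0.2])
print(wgan_critic_loss(c_real, c_fake))  # negative: critic separates real from fake
```

In the full WGAN the critic must also be kept approximately 1-Lipschitz, via weight clipping in the original formulation or a gradient penalty in WGAN-GP; that constraint is omitted in this sketch.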
The growing spectrum of Generative Adversarial Network (GAN) applications in medical imaging, cyber security, data augmentation, and remote sensing makes a critical review of GANs increasingly necessary. Earlier reviews have focused on a particular GAN architecture or a specific application area, and thus lack a systematic comparative analysis of the models' performance metrics. Many reviews do not apply standardized frameworks, leaving gaps in the evaluation of GAN efficiency, training stability, and suitability for specific tasks. In this work, a systematic review of GAN models using the PRISMA framework is developed to fill this gap by structurally evaluating GAN architectures. A wide variety of GAN models is discussed, from the basic Conditional GAN, Wasserstein GAN, and Deep Convolutional GAN to specialized models such as EVAGAN, FCGAN, and SIF-GAN, covering applications across domains like fault diagnosis, network security, medical imaging, and image segmentation. The PRISMA methodology systematically filters relevant studies by inclusion and exclusion criteria to ensure transparency and replicability in the review process. All models are then assessed against specific performance metrics such as accuracy, stability, and computational efficiency. The PRISMA approach offers multiple benefits in this setting: it helps identify optimal models for various applications, and it provides an explicit framework for comparing GAN performance. In addition, diverse types of GAN are included to ensure a comprehensive view of state-of-the-art techniques.
This work is valuable not only for its results but also because it guides future research: it pinpoints which applications call for which GAN architectures, improves task-specific model selection, and identifies areas for further work on the development and application of GANs.
... Training continues until the discriminator can no longer distinguish between real and generated data. GANs gradually improve their performance through this dynamic interaction and generate synthetic data that closely resemble real-world examples [80,81]. DCGANs specifically use CNNs in both the generator and the discriminator. ...
This study aims to improve the efficiency of mineral exploration by introducing a novel application of Deep Convolutional Generative Adversarial Networks (DCGANs) to augment geological evidence layers. By training a DCGAN model with existing geological, geochemical, and remote sensing data, we have synthesized new, plausible layers of evidence that reveal unrecognized patterns and correlations. This approach deepens the understanding of the controlling factors in the formation of mineral deposits. The implications of this research are significant and could improve the efficiency and success rate of mineral exploration projects by providing more reliable and comprehensive data for decision-making. The predictive map created using the proposed feature augmentation technique covered all known deposits in only 18% of the study area.
... Mode collapse, where the generator fails to cover the full diversity of the target data, remains a key issue [6,7]. Strategies include manifold-preserving GANs (MP-GAN) [8], which apply entropy maximization on the data manifold, and mutual information maximization in models like InfoMax-GAN [9]. ...
... For example, the presence of sharp change points in financial time series data due to noise contamination severely degrades the prediction performance of these models [12]. Therefore, with the rapid advancement in deep learning technology, more complex models are also being tried for financial time series prediction, including transformers [13], Generative Adversarial Networks (GANs) [14], Reinforcement Learning (RL) [15], and so on. Meanwhile, to overcome deficiencies associated with individual models, a series of hybrid models have also been implemented to enhance the prediction accuracy of financial time series [3,5]. ...
Financial time series prediction is a fundamental problem in investment and risk management. Deep learning models, such as multilayer perceptrons, Convolutional Neural Networks (CNNs), and Long Short-Term Memory (LSTM), have been widely used in modeling time series data by incorporating historical information. Among them, LSTM has shown excellent performance in capturing long-term temporal dependencies in time-series data, owing to its enhanced internal memory mechanism. In spite of the success of these models, they fail to perform well in the presence of sharp change points. To address this problem, we propose, in this article, an innovative financial time series prediction method inspired by the Deep Operator Network (DeepONet) architecture, which uses a combination of transformer architecture and a one-dimensional CNN network for processing feature-based information, followed by an LSTM-based network for processing temporal information. It is therefore named the CNN-LSTM-Transformer (CLT) model. It not only incorporates external information to identify latent patterns within the financial data but also excels in capturing their temporal dynamics. The CLT model adapts to evolving market conditions by leveraging diverse deep-learning techniques. This dynamic adaptation plays a pivotal role in navigating abrupt changes in the financial markets. Furthermore, the CLT model improves long-term prediction accuracy and stability compared with state-of-the-art deep learning models and also mitigates the adverse effects of market volatility. The experimental results show the feasibility and superiority of the proposed CLT model in terms of prediction accuracy and robustness as compared to existing prediction models.
Moreover, we posit that the innovation encapsulated in the proposed DeepONet-inspired CLT model also holds promise for applications beyond the confines of finance, such as remote sensing, data mining, natural language processing, and so on.
... There is another class of deep learning model, known as the Generative Adversarial Network (GAN) [48][49][50], which can produce highly realistic images. However, it can also inadvertently introduce artifacts resembling real structures [51][52][53], such as additional vessel-like patterns in PA imaging. These spurious features arise because the generator overlearns certain features in an attempt to fool the discriminator, leading to structures that resemble real vessels but are not present in the original image. ...
Recent advances in Light Emitting Diode (LED) technology have enabled a more affordable high frame rate photoacoustic (PA) imaging alternative to traditional laser-based PA systems, which are costly and have slow pulse repetition rates. However, a major disadvantage of LEDs is their low energy output, which does not produce high signal-to-noise ratio (SNR) PA images. There have been recent advancements in integrating deep learning methodologies aimed at addressing the challenge of improving SNR in LED-PA images, yet comprehensive evaluations across varied datasets and architectures are lacking. In this study, we systematically assess the efficacy of various Encoder-Decoder-based CNN architectures for enhancing SNR in real-time LED-based PA imaging. Through experimentation with in vitro phantoms, ex vivo mouse organs, and in vivo tumors, we compare basic convolutional autoencoder and U-Net architectures, explore hierarchical depth variations within U-Net, and evaluate advanced variants of U-Net. Our findings reveal that while U-Net architectures generally exhibit comparable performance, the Dense U-Net model shows promise in denoising different noise distributions in the PA image. Notably, hierarchical depth variations did not significantly impact performance, emphasizing the efficacy of the standard U-Net architecture for practical applications. Moreover, the study underscores the importance of evaluating robustness to diverse noise distributions, with Dense U-Net and R2 U-Net demonstrating resilience to Gaussian, salt and pepper, Poisson, and Speckle noise types. These insights inform the selection of appropriate deep learning architectures based on application requirements and resource constraints, contributing to advancements in PA imaging technology.
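The four noise distributions used in such robustness evaluations can be simulated with standard models; the amplitudes below are illustrative choices, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(42)

def gaussian(img, sigma=0.05):
    """Additive zero-mean Gaussian noise."""
    return np.clip(img + rng.normal(0, sigma, img.shape), 0, 1)

def salt_and_pepper(img, amount=0.05):
    """Randomly force a fraction of pixels to 0 (pepper) or 1 (salt)."""
    out = img.copy()
    mask = rng.random(img.shape)
    out[mask < amount / 2] = 0.0
    out[mask > 1 - amount / 2] = 1.0
    return out

def poisson(img, scale=255.0):
    """Signal-dependent shot noise: Poisson counts at a given photon scale."""
    return np.clip(rng.poisson(img * scale) / scale, 0, 1)

def speckle(img, sigma=0.1):
    """Multiplicative speckle noise, common in coherent imaging."""
    return np.clip(img * (1 + rng.normal(0, sigma, img.shape)), 0, 1)

img = np.full((32, 32), 0.5)  # flat toy image
for fn in (gaussian, salt_and_pepper, poisson, speckle):
    print(fn.__name__, round(float(np.abs(fn(img) - img).mean()), 3))
```

Evaluating a denoiser against all four distributions, as the study does, checks that it has not overfit to a single noise model.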
... When all pixels of an image are completely processed, a complete image is produced by a model that is trained for this specific task. Many tasks are carried out with convolutional neural networks, such as classification [27,28], segmentation [29,30], restoration [31,32], object detection [33,34], among other tasks [35][36][37][38][39]. ...
Fringe profilometry is a method that obtains the 3D information of objects by projecting a pattern of fringes. The three-step technique uses only three images to acquire the 3D information from an object, and many studies have been conducted to improve this technique. However, there is a problem inherent to this technique: the quasi-periodic noise that it introduces, which considerably affects the final reconstructed 3D object. Many studies have been carried out to tackle this problem and obtain a 3D object close to the original one. The application of deep learning in many areas of research presents a great opportunity to reduce or eliminate the quasi-periodic noise that affects these images. Therefore, a convolutional neural network model, along with four different patterns of frequencies projected in the three-step technique, is investigated in this work. The inferences produced by models trained with different frequencies are compared with the original images both qualitatively and quantitatively.
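The three-step technique referenced above recovers the wrapped phase from three fringe images shifted by 120°. A sketch on noise-free synthetic fringes, where the phase ramp and modulation values are arbitrary:

```python
import numpy as np

def three_step_phase(i1, i2, i3):
    """Wrapped phase from three fringe images with shifts -120°, 0°, +120°:
    phi = atan2(sqrt(3) * (I1 - I3), 2*I2 - I1 - I3)."""
    return np.arctan2(np.sqrt(3.0) * (i1 - i3), 2.0 * i2 - i1 - i3)

# Synthetic fringes: I_k = A + B * cos(phi + delta_k)
x = np.linspace(-1.0, 1.0, 200)
phi_true = 2.0 * x                       # toy phase ramp, kept within (-pi, pi)
A, B = 0.5, 0.4                          # background and modulation (arbitrary)
deltas = (-2 * np.pi / 3, 0.0, 2 * np.pi / 3)
i1, i2, i3 = (A + B * np.cos(phi_true + d) for d in deltas)

phi = three_step_phase(i1, i2, i3)
print(float(np.max(np.abs(phi - phi_true))))  # ~0: exact recovery without noise
```

With real captures, the quasi-periodic noise discussed in the abstract corrupts the three intensities, and the arctangent propagates it into the phase map, which is what the denoising network targets.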
... One of the most critical challenges in training GANs is mode collapse, where the generator fails to capture the full diversity of the data distribution 25,26. In the context of tabular data, mode collapse can result in the generator producing a limited range of outputs that do not adequately reflect the various categories or numerical ranges present in the original dataset. ...
... Mode collapse is often identified when the generator outputs overly similar results or when the generator's loss curve becomes flat, indicating a failure to effectively learn the data distribution. To mitigate mode collapse, several techniques, including minibatch discrimination and spectral normalization, have been introduced 25,26. Minibatch discrimination encourages the generator to produce a greater diversity of outputs by allowing the discriminator to assess patterns across a batch of data, rather than evaluating individual samples independently. ...
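Minibatch discrimination can be sketched as follows: batch features are projected through a tensor, and each sample is scored by its closeness to the rest of the batch, so a collapsed batch stands out to the discriminator. The projection tensor here is random for illustration; in the original technique (Salimans et al., 2016) it is learned:

```python
import numpy as np

def minibatch_features(h, T):
    """Simplified minibatch discrimination: project per-sample features
    h (n, a) through tensor T (a, b, c), then score each sample by its
    exp(-L1) similarity to every other sample in the batch."""
    M = np.einsum('na,abc->nbc', h, T)                 # (n, b, c) projections
    l1 = np.abs(M[:, None] - M[None, :]).sum(axis=3)   # (n, n, b) pairwise L1
    k = np.exp(-l1)
    return k.sum(axis=1) - 1.0  # (n, b); subtract each sample's self-similarity

rng = np.random.default_rng(0)
T = rng.normal(size=(4, 3, 5))
h_diverse = rng.normal(size=(8, 4))           # varied batch of features
h_collapsed = np.tile(h_diverse[:1], (8, 1))  # mode-collapsed batch: all identical
# A collapsed batch scores strictly higher, giving the discriminator a signal.
print(minibatch_features(h_collapsed, T).mean() > minibatch_features(h_diverse, T).mean())
```

These similarity features are concatenated to the discriminator's per-sample features, so generating near-duplicate samples becomes easy to detect and is penalized.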
... Spectral normalization, on the other hand, stabilizes GAN training by controlling the Lipschitz constant of the discriminator, ensuring that the generator learns meaningful features while avoiding mode collapse. These techniques collectively enhance the generator's ability to produce outputs that more accurately reflect the original data distribution 25,26. ...
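Spectral normalization amounts to dividing each weight matrix by its largest singular value, typically estimated by power iteration. A minimal sketch; in practice the estimate is amortized over training with one iteration per step and a persisted vector, rather than run to convergence each time:

```python
import numpy as np

def spectral_normalize(W, n_iter=200):
    """Divide W by its top singular value, estimated via power iteration,
    as in spectral normalization (Miyato et al., 2018). This bounds the
    layer's Lipschitz constant by 1."""
    rng = np.random.default_rng(0)
    u = rng.normal(size=W.shape[0])
    for _ in range(n_iter):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    sigma = u @ W @ v          # converged estimate of the top singular value
    return W / sigma

W = np.random.default_rng(1).normal(size=(6, 4))
W_sn = spectral_normalize(W)
print(round(np.linalg.svd(W_sn, compute_uv=False)[0], 6))  # ≈ 1.0
```

Because the Lipschitz constant of the whole discriminator is bounded by the product of its layers' spectral norms, normalizing every layer keeps its gradients well behaved.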
Credit scoring models are critical for financial institutions to assess borrower risk and maintain profitability. Although machine learning models have improved credit scoring accuracy, imbalanced class distributions remain a major challenge. The widely used Synthetic Minority Oversampling TEchnique (SMOTE) struggles with high-dimensional, non-linear data and may introduce noise through class overlap. Generative Adversarial Networks (GANs) have emerged as an alternative, offering the ability to model complex data distributions. Conditional Wasserstein GANs (cWGANs) have shown promise in handling both numerical and categorical features in credit scoring datasets. However, research on extracting latent features from non-linear data and improving model explainability remains limited. To address these challenges, this paper introduces the Non-parametric Oversampling Technique for Explainable credit scoring (NOTE). The NOTE offers a unified approach that integrates a Non-parametric Stacked Autoencoder (NSA) for capturing non-linear latent features, cWGAN for oversampling the minority class, and a classification process designed to enhance explainability. The experimental results demonstrate that NOTE surpasses state-of-the-art oversampling techniques by improving classification accuracy and model stability, particularly in non-linear and imbalanced credit scoring datasets, while also enhancing the explainability of the results.
... Generally, one issue is the instability of the training process, which ultimately results in non-convergence. Using the Adam optimizer with carefully selected learning rates for both the generator and the discriminator helps maintain stable updates and prevents oscillations [28,29]. ...
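The two-optimizer setup described above can be sketched with a minimal Adam implementation. The asymmetric learning rates below are illustrative (a TTUR-style choice of a faster discriminator), not values from the cited work:

```python
import numpy as np

class Adam:
    """Minimal Adam optimizer (Kingma & Ba, 2015) for one parameter array."""
    def __init__(self, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.b1, self.b2, self.eps = lr, beta1, beta2, eps
        self.m = self.v = 0.0  # first/second moment estimates
        self.t = 0             # step counter for bias correction

    def step(self, params, grads):
        self.t += 1
        self.m = self.b1 * self.m + (1 - self.b1) * grads
        self.v = self.b2 * self.v + (1 - self.b2) * grads ** 2
        m_hat = self.m / (1 - self.b1 ** self.t)   # bias-corrected moments
        v_hat = self.v / (1 - self.b2 ** self.t)
        return params - self.lr * m_hat / (np.sqrt(v_hat) + self.eps)

# Separate optimizers with separate rates, as the snippet suggests
opt_g = Adam(lr=1e-4)   # generator: smaller step
opt_d = Adam(lr=4e-4)   # discriminator: larger step (hypothetical ratio)
theta_g = np.zeros(3)
theta_g = opt_g.step(theta_g, grads=np.ones(3))
print(theta_g)  # the first Adam step moves each weight by exactly -lr
```

Keeping two independent optimizer states matters: the moment estimates for the generator and discriminator track different gradient statistics and must not be shared.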
The increased use of Internet of Things (IoT) devices has led to greater threats to privacy and security. This has created a need for more effective cybersecurity applications. However, the effectiveness of these systems is often limited by the lack of comprehensive and balanced datasets. This research contributes to IoT security by tackling the challenges in dataset generation and providing a valuable resource for IoT security research. Our method involves creating a testbed, building the 'Joint Dataset', and developing an innovative tool. The tool consists of two modules: an Exploratory Data Analysis (EDA) module and a Generator module. The Generator module uses a Conditional Generative Adversarial Network (CGAN) to address data imbalance and generate high-quality synthetic data that accurately represent real-world network traffic. To showcase the effectiveness of the tool, the proportion of imbalance reduction in the generated dataset was computed and benchmarked against the BOT-IOT dataset. The results demonstrated the robustness of synthetic data generation in creating balanced datasets.
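CGAN conditioning of the kind used here is typically implemented by concatenating a one-hot class label to the generator's noise input, so a chosen (e.g. minority) traffic class can be synthesized on demand. A minimal sketch; the dimensions and labels are arbitrary:

```python
import numpy as np

def conditional_input(z, labels, n_classes):
    """Concatenate a one-hot class label to each noise vector: the standard
    CGAN conditioning that lets the generator target a specific class."""
    onehot = np.eye(n_classes)[labels]           # (batch, n_classes)
    return np.concatenate([z, onehot], axis=1)   # (batch, z_dim + n_classes)

rng = np.random.default_rng(0)
z = rng.normal(size=(4, 8))          # batch of latent noise vectors
labels = np.array([0, 2, 2, 1])      # hypothetical minority classes to synthesize
g_in = conditional_input(z, labels, n_classes=3)
print(g_in.shape)  # (4, 11)
```

The discriminator receives the same label alongside each (real or generated) sample, so both networks learn class-conditional distributions, which is what makes targeted oversampling of under-represented attack classes possible.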