Access to this full-text is provided by Tech Science Press.
Content available from Computers, Materials & Continua
This content is subject to copyright. Terms and conditions apply.
Doi:./cmc..
ARTICLE
Enhancing Malware Detection Resilience: A U-Net GAN Denoising Framework
for Image-Based Classification
Huiyao Dong1 and Igor Kotenko2,*
Faculty of Information Technology and Security, ITMO National Research University, St. Petersburg, , Russia
Laboratory of Computer Security Problems, St. Petersburg Federal Research Center of the Russian Academy of Sciences, St.
Petersburg, , Russia
*Corresponding Author: Igor Kotenko. Email: ivkote@comsec.spb.ru
Received: December ; Accepted: January ; Published: March
ABSTRACT: The growing complexity of cyber threats requires innovative machine learning techniques, and image-based
malware classification opens up new possibilities. However, existing research has largely overlooked the impact
of noise and obfuscation techniques commonly employed by malware authors to evade detection, and there is a critical
gap in using noise simulation as a means of replicating real-world malware obfuscation techniques and adopting
a denoising framework to counteract these challenges. This study introduces an image denoising technique based on
a U-Net combined with a GAN framework to address noise interference and obfuscation challenges in image-based
malware analysis. The proposed methodology addresses existing classification limitations by introducing noise addition,
which simulates obfuscated malware, and denoising strategies to restore robust image representations. To evaluate the
approach, we used multiple CNN-based classifiers to assess noise resistance across architectures and datasets, measuring
significant performance variation. Our denoising technique demonstrates remarkable performance improvements
across two multi-class public datasets, MALIMG and BIG-. For example, the MALIMG classification accuracy
improved from .% to .% with denoising applied after Gaussian noise injection, demonstrating robustness. This
approach contributes to improving malware detection by offering a robust framework for noise-resilient
classification.
KEYWORDS: Malware; cybersecurity; deep learning; denoising
1 Introduction
Malware variants are rapidly becoming one of the most serious threats to IoT systems. According to
SonicWall’s Mid-year Cyber Threat Report, malware attacks have increased by % over the same period
last year, with an increase in IoT malware (+%) []. According to their data, targeted IoT devices have an
average attack time of . h. Hence, malware detection and analysis systems play a pivotal role in identifying
and mitigating threats as they emerge, employing various techniques such as signature-based detection,
anomaly detection, and heuristics. However, the dynamic nature of malware necessitates the adoption of
advanced methodologies to enhance detection accuracy and reduce false positives.
Copyright © The Authors. Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution . International License, which permits
unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Comput Mater Contin. ;()
In recent years, machine learning (ML)-based approaches have gained prominence due to their capacity
to learn complex patterns associated with malware behavior, and the evolving landscape of malware has
prompted researchers to explore innovative approaches for classification, including the utilization of
image-based representations of malware binaries. By converting executable files into visual formats, such as
grayscale or RGB images, it becomes feasible to apply deep learning (DL) techniques to classify malware
based on its visual characteristics. The conversion of images may involve methods such as extracting features
from Local Binary Patterns (LBP) [] or transforming binary files into spectrogram images for subsequent
classification []. Deep learning techniques, such as Autoencoders (AEs) and Variational Autoencoders
(VAEs), in conjunction with hybrid classifiers, can effectively learn complex patterns present in image-based
malware samples []. However, the high dimensionality of image representations presents challenges in
training models efficiently, as training on such complex data necessitates substantial computational resources
and time []. Transfer learning, which leverages pre-trained convolutional neural networks (CNNs) that
utilize previously acquired knowledge from large datasets, can effectively address this challenge by
fine-tuning models on specific malware data []. This intersection of image processing and malware analysis
opens new frontiers in the search for robust and efficient malware classification systems.
With recent advances in image-based malware analysis, ensuring reliable and effective security measures
remains a significant challenge. For example, code injection has been shown to significantly impact the
performance of deep learning (DL)-based classifiers. Injecting a code sequence into the beginning or middle
of an executable file can alter its image representation. Consequently, the modified pixel representation
may introduce patterns or distortions that could confuse or mislead classifiers. Furthermore, many existing
approaches show a limited ability to handle obfuscated images, as this critical problem has received
insufficient attention in prior research. Obfuscation serves as a crucial mechanism to protect sensitive information
by reducing data identifiability []. It can pose challenges for malware analysis models, as identifiable features
associated with malware have been altered. Although innovative transformation techniques can be employed
to convert software into image-like arrays, the introduced noise and complexity necessitate advanced, robust
models capable of tolerating and extracting essential information from altered and dynamic structures.
Popular approaches such as data augmentation, image denoising algorithms, and adversarial training have
proven helpful in mitigating obfuscation and noise in malware samples. However, these approaches are still
not effective in addressing malware-specific challenges.
Denoising for malware imagery data presents unique and significant challenges due to the distinct
characteristics of malware. Malware images, derived by transforming binary files into visual representations,
encode subtle patterns within the file content. These small-scale structural features are critical for accurate
malware classification, but the presence of noise can easily disrupt these localized features, complicating the
reconstruction of the original signal. In most denoising applications, noise is incidental (e.g., sensor noise
in medical images or environmental noise in natural images). In contrast, noise in malware imagery data is
deliberately introduced as an evasion technique, making it considerably more difficult to detect and remove.
Conventional denoising methods are generally inadequate against the adversarial and structured nature of
such noise. Additionally, malware classification is hindered by factors such as class imbalance, high dataset
diversity, and the absence of pre-labeled “ground truth” in the malware imagery domain. Consequently,
denoising models must not only generalize beyond synthetic noise distributions but also effectively restore
meaningful patterns, particularly for malware samples from underrepresented classes. Although denoising
has been extensively studied in domains such as medical imaging and natural images, it remains a relatively
unexplored area in the context of malware imagery. Previous work employing a Variational
Autoencoder-Generative Adversarial Network (VAE-GAN) approach suggested that denoising techniques could be
integrated into the architecture to enhance the robustness of malware detection, although specific denoising
techniques were not explicitly mentioned []. Similarly, Bensaoud et al. [] emphasized that a multi-task
learning-based framework could improve classification performance and potentially involve handling noise
in the data. However, they did not elaborate on the denoising technique or encryption detection.
In this paper, we propose an image denoising technique based on the U-Net and GAN frameworks.
Initially, we implemented a noise addition technique to simulate obfuscated and noisy malware samples.
Subsequently, we designed a CNN-based U-Net model, incorporating multiple encoding and decoding
blocks for effective feature extraction and image reconstruction. For a comprehensive comparison, we used
several classifiers to perform the malware classification task on original samples, samples with noise addition,
and denoised samples. We selected two public datasets with multiple classes to test the model’s robustness
and adaptability. The experimental results demonstrate that while the noise addition simulation significantly
affects the generalizability of the classifiers, the application of the U-Net denoising model can enable the
classifiers to yield results comparable to those obtained from the original, noise-free samples.
The remainder of the paper is organized as follows. Section summarizes existing research in
image-based malware analysis; Section outlines the proposed techniques, including noise simulation and the
denoising model; Section presents the experimental results; Section discusses the implementation
and application of the model, along with future plans. Finally, the paper concludes with a summary of
our findings.
2 Related Work
ML models can classify and predict malware variants with remarkable precision, thus enhancing
traditional detection techniques. By extracting features from Executable and Linkable Format (ELF) files and
opcode sequences, it becomes feasible to identify malicious behaviors in Internet of Things (IoT) devices
using algorithms such as Random Forest, Support Vector Machines (SVM), and Neural Networks [].
Furthermore, it is practical to utilize ensemble learning models to analyze system calls generated by Android
applications to identify malicious behaviors []. In the domain of image-based malware classification,
pre-trained CNN models, particularly EfficientNet, have emerged as prevalent tools for learning malware image
representations with standardized image widths []. Dao et al. [] proposed a lightweight architecture
that combined small CNN models with a Variational Autoencoder, resulting in superior performance and
efficiency in malware classification. Further innovations have explored hybrid approaches and alternative
architectures to improve classification accuracy. Thakur et al. [] proposed a robust malware classification
framework that utilizes a comprehensive voting scheme among various CNN-LSTM classifiers. In contrast,
Karat et al. [] advocated the use of similar hybrid classifiers, specifically CNN-LSTM, but placed the
emphasis on behavioral analysis as a crucial element in identifying Zero-Day malware. Their approach incorporated
extensive information, including log parsing and API monitoring, to enable real-time behavioral analysis.
Dong [] proposed a sophisticated hybrid approach combining GAN for synthetic oversampling with a
simplified Vision Transformer (ViT)-based classifier for malware image classification.
Although datasets containing obfuscated malware variants and specimens with subtle code injections
remain scarce, these adversarial scenarios can be systematically simulated through the application of
controlled noise addition to malware imagery data. This methodology enables the evaluation of the resilience of
the detection system against potential obfuscation techniques and minor code modifications, while providing
a reproducible framework for robustness assessment. Common approaches in this field include hybrid noise
addition, blind denoising, and the use of additive white noisy images or real-world noisy images. Although
CNNs and GANs are commonly used in image denoising research, as highlighted in a recent review [],
several innovative techniques have been introduced. Su et al. [] proposed a CNN-based multi-scale
cross-path concatenation residual network with multiple skip connections specifically for Poisson image denoising.
Similarly, Jifara et al. [] utilized a CNN-based deep model featuring residual connections to enhance
medical image denoising. Yuan et al. [] introduced HSID-CNN, a nonlinear mapping technique for noisy
and clean images, utilizing a combined spatial-spectral deep CNN model. This model executes multiscale
feature extraction and multilevel feature representation to effectively capture spatial-spectral features.
Although the development of denoising techniques in AI, particularly in medical image processing, has
progressed for several years with demonstrably effective outcomes, there remains a scarcity of applications
using image denoising techniques for malware detection and classification. Li et al. [] presented a malware
classification method incorporating a dedicated denoising module through dynamic analysis and embedding
techniques to encode API sequences, introducing a soft threshold mechanism for active noise filtering and
employing a bidirectional LSTM model for classification. Kumi et al. [] proposed a Block-matching and
D filtering algorithm alongside a deep CNN-based denoising technique to address adversarial examples.
A recent study aimed at enhancing the robustness of classifiers against adversarial attacks introduced the
Differential Privacy-Based Noise Clipping defense mechanism [], preventing classifiers from being misled
by data poisoning and gradient-based data poisoning. It modified the training framework to render models
inherently resistant to adversarial noise. All of these techniques were integrated into the training phase of
the deep learning classifier, directly altering the learning process. However, these studies did not investigate
various types of noise and provided limited discussion of performance analysis both before and after noise
injection, as well as after the application of denoising techniques, which represents a significant gap.
The cited work underscores the state-of-the-art (SOTA) techniques in image-based malware classification and the
effectiveness of denoising techniques in various fields, but it reveals a notable gap in the exploration and
application of denoising methods for malware detection and classification tasks. This research addresses
this need by focusing on Gaussian and Speckle noise types, employing a U-Net architecture for denoising.
The proposed model is evaluated using multiple CNN-based classifiers and tested across two datasets to
rigorously assess both robustness and generalizability. Ultimately, this research aims to fill the existing gap
by providing a comprehensive analysis of denoising techniques in the context of malware classification,
contributing to the advancement of noise injection strategies in security applications.
3 Methodology
3.1 Noise Simulation for Obfuscation Techniques
Obfuscation fundamentally alters the code structure of executable files through various techniques,
such as introducing noise, varying features due to alterations in code and data order and form, increasing
data complexity, and creating pattern and signature misalignment. Once an obfuscated file is converted to an
image-like array, a significant number of the original bytes have been inserted or altered, so the likelihood of losing
critical features that differentiate malware from benign software increases, potentially resulting in a decline
in detection accuracy.
In the implementation of image-based obfuscation simulation, multiple technical approaches like
pixel-level transformations, geometric transformations, and adversarial concealment methodologies can be
employed []. This research implements two noise addition methods to simulate obfuscation: Gaussian
noise and Speckle noise.
Gaussian noise represents a fundamental concept in signal processing and image simulations, as it
accurately approximates the natural noise patterns inherent in real-world signals, particularly in imaging
systems. This type of noise is ubiquitous, manifesting itself in virtually all signal transmissions, and is
particularly important for obfuscation and privacy simulations due to its universality and versatility [].
The distinguishing characteristic of Gaussian noise lies in its normal distribution properties, rendering it
particularly suitable for simulating random perturbations across diverse applications. Within the domain of
image obfuscation and simulation methodologies, Gaussian noise serves as an indispensable tool for image
data processing, functioning both to evaluate the resilience of denoising algorithms and as a mechanism to
obscure sensitive information in privacy-critical images.
The implementation of Gaussian noise addition follows several steps. First, the image data is normalized
to a range of [, ] by dividing the image array by . to prevent pixel overflow when noise is added.
Then, random Gaussian noise is generated with a specified mean and standard deviation. Lastly, the noise
is added element-wise to the normalized image, producing the noisy image. Mathematically, the
transformation from the original image to the noisy one can be represented using Eq. ().
$$I_{\text{noisy}}(x, y, c) = I_{\text{original}}(x, y, c) + N(x, y, c)$$ ()
Here, the term I(x, y, c) represents the pixel value of an image at position (x, y) for channel c,
and N(x, y, c) represents the Gaussian noise added to each pixel. This noise is sampled from the normal
distribution as shown in Eq. ().
$$N(x, y, c) \sim \mathcal{N}(\mu, \sigma^2)$$ ()
Here, μ stands for the mean and σ stands for the standard deviation of the distribution. In this
research, we set the mean value at and treat the standard deviation as a parameter.
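The Gaussian noise steps described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `add_gaussian_noise`, the 8-bit scaling constant 255.0, the zero default mean, the example `noise_factor` of 0.1, and the final clipping to [0, 1] are all choices made for this example.

```python
import numpy as np


def add_gaussian_noise(image, noise_factor=0.1, mean=0.0, rng=None):
    """Return a noisy copy of an 8-bit image array of shape (H, W, C)."""
    rng = np.random.default_rng() if rng is None else rng
    # Step 1: normalize pixel values (255.0 assumes 8-bit input data).
    normalized = image.astype(np.float32) / 255.0
    # Step 2: sample N(x, y, c) independently for every pixel and channel.
    noise = rng.normal(loc=mean, scale=noise_factor, size=normalized.shape)
    # Step 3: element-wise addition. Clipping is an added assumption that
    # keeps the result inside the valid intensity range.
    return np.clip(normalized + noise, 0.0, 1.0).astype(np.float32)


# Usage on a synthetic stand-in for a malware image.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
noisy = add_gaussian_noise(image, noise_factor=0.1, rng=rng)
```

Because the noise is sampled independently of the pixel values, dark and bright regions are perturbed by the same amount on average.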
Speckle noise, in contrast to Gaussian noise, exhibits multiplicative characteristics and originates
from coherent processes such as laser or radar imaging, where constructive and destructive interference
patterns introduce signal corruption []. This noise variant demonstrates relevance in simulating real-world
distortions encountered in medical imaging applications, including magnetic resonance imaging (MRI), and
in remote sensing systems. Its multiplicative properties alter pixel intensities in proportion to their original
values, rendering it an essential component in realistic obfuscation simulations that aim to model these
specific imaging scenarios. The mathematical representation can be described as Eq. ().
$$I_{\text{noisy}}(x, y, c) = I_{\text{original}}(x, y, c) + N(x, y, c) \times I_{\text{original}}(x, y, c)$$ ()
Here, the speckle noise N(x, y, c) is sampled from a standard normal distribution and scaled by the noise factor. This
operation creates multiplicative distortion proportional to the original image’s pixel values.
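The speckle formulation above can be illustrated with a short NumPy sketch. The function name `add_speckle_noise`, the 8-bit scaling constant 255.0, the example `noise_factor`, and the clipping step are assumptions made for this example rather than details taken from the paper.

```python
import numpy as np


def add_speckle_noise(image, noise_factor=0.1, rng=None):
    """Return a copy of an 8-bit image with multiplicative (speckle) noise."""
    rng = np.random.default_rng() if rng is None else rng
    normalized = image.astype(np.float32) / 255.0  # assumes 8-bit input
    # Standard normal sample scaled by the noise factor.
    noise = noise_factor * rng.standard_normal(normalized.shape)
    # I_noisy = I + N * I: the perturbation scales with the pixel intensity.
    noisy = normalized + noise * normalized
    # Clipping is an added assumption to keep values in [0, 1].
    return np.clip(noisy, 0.0, 1.0).astype(np.float32)


# Usage: the left half is black, the right half is bright.
rng = np.random.default_rng(0)
image = np.zeros((32, 32, 3), dtype=np.uint8)
image[:, 16:] = 200
noisy = add_speckle_noise(image, noise_factor=0.1, rng=rng)
```

In this example the black half is left untouched while the bright half is perturbed, which is the proportional behaviour the equation describes.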
Speckle noise, being prevalent in numerous real-world imaging systems, provides an accurate simulation
of these environments when incorporated into experimental frameworks []. The multiplicative nature
of speckle noise, which inherently correlates with original pixel values, renders it particularly effective in
obfuscating fine features within images, such as textures and high-frequency details.
To demonstrate the effects of different noise types, we compared original image samples with their
corresponding versions containing Gaussian and speckle noise. In both cases, the noise_factor, namely
the standard deviation value, was set to a high value of . to clearly illustrate the distinctions. As shown
in Fig. , Gaussian noise introduces dynamic perturbations that substantially obscure the original texture,
whereas speckle noise maintains the underlying textural characteristics. This distinction arises from their
fundamental mathematical properties and their respective interactions with the original image. Gaussian
noise is characterized by its additive nature, where noise is independently applied to each pixel irrespective
of its original intensity, uniformly affecting all image regions regardless of their intensity or texture. This
superimposition of a completely independent random pattern effectively masks the original texture. In
contrast, speckle noise exhibits multiplicative properties, where the noise effect is proportional to the original
pixel intensity. This results in texture preservation, as low-intensity regions experience minimal noise impact
while high-intensity regions undergo more substantial modications. Consequently, this proportional
relationship facilitates the preservation of original texture patterns despite the noise introduction.
Figure 1: Malware samples with different noise addition
3.2 Denoising Workflow with U-Net
The denoising workflow functions as a comprehensive strategy for mitigating image-based obfuscation.
In the initial stage, we conduct extensive training of the denoising model on a substantial dataset, enabling it
to effectively generate denoised images from those that have undergone noise addition. After the model has
converged, the trained weights are saved for future usage. Subsequently, we apply this model following data
preprocessing and prior to classification to execute the denoising operation.
The U-Net architecture is widely employed in computer vision for both image denoising [] and
super-resolution [] tasks, owing to its proficiency in learning spatial hierarchies and context-rich features. This
architecture comprises a deep encoder and a deep decoder. The encoder functions as a feature extractor,
wherein successive convolutional layers and max pooling operations diminish the spatial dimensions of the
image while preserving meaningful high-level features. Conversely, the decoder progressively upsamples
these features to restore the original resolution. In the context of image denoising, the encoder facilitates
the understanding of global noise patterns, while the decoder reconstructs noise-free images by restoring
local fine-grained features. Additionally, U-Net incorporates skip connections that directly link layers in the
encoder with their corresponding layers in the decoder, thereby enabling the model to recover fine spatial
details that may be lost during the downsampling process.
In this study, we implement a CNN-based U-Net and optimize it through Mean Squared Error (MSE)
loss for reconstruction evaluation. For a given noisy image x, the U-Net generates an output image ŷ such
that ŷ ∼ y, where y represents the expected, original malware image without noise. The model minimizes
the discrepancy between the reconstructed image ŷ and the ground truth y. The denoising loss function is
the MSE between the predicted clean image ŷ and the ground truth clean image y. The MSE is given by:
$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^2$$ ()
where N denotes the total number of pixels. Minimizing the MSE loss reduces the pixel-wise
error across the entire image.
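As a small worked example of the MSE definition, the following NumPy sketch computes the loss for a hypothetical 2 × 2 single-channel image pair. The helper name `mse` and the pixel values are invented for illustration only.

```python
import numpy as np


def mse(predicted, target):
    """Mean squared error averaged over all N pixels."""
    predicted = np.asarray(predicted, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean((predicted - target) ** 2))


# Hypothetical ground-truth image y and reconstruction y_hat (N = 4 pixels).
clean = np.array([[0.0, 0.5], [1.0, 0.25]])
denoised = np.array([[0.1, 0.5], [0.8, 0.25]])

# Squared errors are 0.01, 0, 0.04 and 0, so the mean is about 0.0125.
loss = mse(denoised, clean)
```

Two of the four pixels are reconstructed exactly, so only the remaining two contribute to the average.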
Fig. presents the proposed U-Net architecture. The major components are encoding modules,
decoding modules, skip connections, and the final convolution layer for shape adjustment.
Figure 2: The model architecture of the proposed U-Net
The encoding module captures spatial context and semantic features by progressively reducing spatial
resolution. Within each module, convolutions are applied to decrease spatial resolution while increasing feature
depth, followed by batch normalization and Leaky ReLU activation. The activation function introduces
non-linearity with a slight allowance for negative values, controlled by a negative slope of .. This module is
employed to contract images while preserving key features and structural information.
The decoding module reconstructs the input by progressively increasing spatial resolution while merging
features from the encoder through skip connections. First, a transposed convolution layer is employed for
feature upscaling. Next, a Leaky ReLU activation is applied. A dropout layer is then incorporated to randomly
deactivate neurons, thereby introducing regularization and preventing overfitting. This design facilitates the
expansion of features in the decoder path while enhancing robustness through regularization.
The U-Net is constructed in a modular format, incorporating distinct units for both convolution and
transposed convolution layers. To achieve downsampling, five encoding modules are implemented, which
halve the spatial resolution while simultaneously increasing the feature depth. In this context, the feature
depth progresses as follows: , , , , and . The output feature maps from each block are retained
to enable skip connections. For the upscaling procedure, five decoding modules are employed to double the
spatial resolution while reducing the feature depth to match the input size. In a reverse manner, the feature
depth decreases in the following sequence: ----. Feature maps from the corresponding
encoding blocks are concatenated at each stage to preserve spatial details. Finally, a × convolution is
utilized to map the final output to the desired number of channels (e.g., three for RGB images).
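The modular layout described above can be sketched in PyTorch as follows. This is an illustrative approximation, not the authors' model: the class names, the feature depths (16, 32, 64, 128, 256), the kernel size 4 with stride 2, the Leaky ReLU slope of 0.2, the dropout rate of 0.3, and the 1 × 1 final convolution are all assumptions, since the exact values are not recoverable from the text.

```python
import torch
import torch.nn as nn


class EncoderBlock(nn.Module):
    """Strided convolution -> batch norm -> Leaky ReLU; halves H and W."""

    def __init__(self, in_ch, out_ch, slope=0.2):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.LeakyReLU(slope),
        )

    def forward(self, x):
        return self.block(x)


class DecoderBlock(nn.Module):
    """Transposed convolution -> Leaky ReLU -> dropout; doubles H and W."""

    def __init__(self, in_ch, out_ch, slope=0.2, dropout=0.3):
        super().__init__()
        self.block = nn.Sequential(
            nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(slope),
            nn.Dropout(dropout),
        )

    def forward(self, x):
        return self.block(x)


class UNetSketch(nn.Module):
    """Five encoders, five decoders, skip connections, final 1x1 convolution."""

    def __init__(self, channels=3, depths=(16, 32, 64, 128, 256)):
        super().__init__()
        enc_in = (channels,) + tuple(depths[:-1])
        self.encoders = nn.ModuleList(
            [EncoderBlock(i, o) for i, o in zip(enc_in, depths)]
        )
        rev = tuple(reversed(depths))
        decoders, in_ch = [], rev[0]
        for k in range(len(rev)):
            has_skip = k + 1 < len(rev)
            out_ch = rev[k + 1] if has_skip else rev[-1]
            decoders.append(DecoderBlock(in_ch, out_ch))
            # Concatenating the matching encoder map doubles the channels.
            in_ch = out_ch * 2 if has_skip else out_ch
        self.decoders = nn.ModuleList(decoders)
        self.final = nn.Conv2d(rev[-1], channels, kernel_size=1)

    def forward(self, x):
        skips = []
        for enc in self.encoders:
            x = enc(x)
            skips.append(x)
        skips.pop()  # the deepest map feeds the decoder; it is not a skip
        for dec in self.decoders:
            x = dec(x)
            if skips:
                x = torch.cat([x, skips.pop()], dim=1)
        return self.final(x)


# Usage: height and width must be divisible by 32 (five halvings).
model = UNetSketch().eval()
with torch.no_grad():
    restored = model(torch.rand(2, 3, 64, 64))
```

With these assumed settings the output has the same shape as the input, so the network can be trained directly against the clean image with a pixel-wise loss.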
The design incorporates reusable and flexible modules, thereby enhancing adaptability to various
image dimensions. The activation function employed, Leaky ReLU, mitigates the issue of dying ReLU,
which occurs when neurons become inactive due to all-zero gradients, thereby promoting more dynamic
training. Additionally, both batch normalization and dropout techniques contribute to stabilizing the
training process and enhancing generalization. The symmetry between the encoder and decoder, along
with the implementation of skip connections, ensures that the reconstruction phase effectively eliminates
unwanted noise while preserving the essential features learned in each encoding module.
To enhance the resilience of these methods, several potential improvements can be implemented: ()
employing obfuscated files for pre-classification analysis to develop discriminative features that effectively
distinguish benign applications from malware; () leveraging adversarial ML techniques to simulate attacks
on classification models using obfuscated malware samples or images augmented with random noise,
which can mimic obfuscation strategies. Furthermore, extensive research in information-hiding techniques
could be integrated into imagery-based malware analysis, utilizing methods such as patch-line and fuzzy
clustering-line to uncover hidden content or anomalies within the images.
3.3 Training with GAN Framework
In the context of combating noise injection, using an encoder-decoder structure is feasible, but
implementing a GAN framework with a simple discriminator can enhance the denoising performance due to
the adversarial training paradigm.
The primary distinction lies in the optimization of the loss function. When utilizing U-Net in isolation,
the focus is on enhancing the noise robustness of models during training. This approach prioritizes
minimizing the reconstruction error between generated images and their corresponding expected images,
without incorporating noise injection. Consequently, while the model gains the ability to handle noisy data, it
lacks generative capabilities; specifically, it cannot be assured of producing realistic samples when compared
to noise-free data. In contrast, a GAN designed for denoising tasks explicitly comprises two networks:
a generator that produces denoised data and a discriminator that differentiates between real (clean) and
generated (denoised) data. These networks are trained adversarially, leading to a model that learns the
underlying data distribution and can reconstruct highly accurate outputs, even in high-noise environments.
Through this adversarial training framework, the model can perform denoising in a nuanced and adaptive
manner, an achievement that pure error-based strategies often struggle to attain [].
Fig. illustrates the complete GAN framework and the model structure of the discriminator.
Following the input configuration, a convolutional block is employed for feature extraction, consisting of
two -dimensional convolutional layers with units, batch normalization, and Leaky ReLU activation.
Subsequently, an adaptive average pooling layer is applied to reduce the size of the feature map. A fully
connected layer with units is then utilized to flatten the learned features. This is followed by a Leaky
ReLU activation, leading to an output layer that predicts the probability of the input being real or fake.
The decision to employ a simplified discriminator within our GAN framework is driven by the objective of
stabilizing the training process while prioritizing the generator’s capabilities. A less complex discriminator
reduces the likelihood of overfitting and allows for a more balanced learning dynamic between the generator
and discriminator. This approach facilitates a more effective training regime, enabling the generator to
develop robust representations without being excessively constrained by the discriminator’s judgments.
Figure 3: The proposed GAN framework and model structure for the discriminator
During the training process, there are two losses to be optimized: generator loss and discriminator loss,
which can be represented as Eqs. () and ().
$$L_G = \mathbb{E}[\mathrm{MSE}(G(x), y)]$$ ()
$$L_D = -\mathbb{E}[\log(D(y)) + \log(1 - D(G(x)))]$$ ()
Here, G(x) represents the generated image, y is the target image, D(y) is the discriminator’s output
for real images, and D(G(x)) is the discriminator’s output for generated images.
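To make the discriminator loss concrete, the sketch below evaluates it in NumPy for two hypothetical sets of discriminator outputs (the generator loss is the MSE already defined). The function name `discriminator_loss`, the probability values, and the `eps` clamp added for numerical stability are assumptions of this example.

```python
import numpy as np


def discriminator_loss(d_real, d_fake, eps=1e-7):
    """L_D = -E[log D(y) + log(1 - D(G(x)))] for probabilities in (0, 1)."""
    # Clamp to avoid log(0); this guard is not part of the formula itself.
    d_real = np.clip(np.asarray(d_real, dtype=np.float64), eps, 1.0 - eps)
    d_fake = np.clip(np.asarray(d_fake, dtype=np.float64), eps, 1.0 - eps)
    return float(-np.mean(np.log(d_real) + np.log(1.0 - d_fake)))


# A discriminator that separates real from generated images well ...
confident = discriminator_loss([0.9, 0.95], [0.1, 0.05])
# ... versus one that outputs 0.5 for everything.
confused = discriminator_loss([0.5, 0.5], [0.5, 0.5])
```

The loss is lower for the confident discriminator, and an undecided discriminator that always outputs 0.5 yields 2 ln 2.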
Before training, we prepared a dataset comprising paired original and noisy images containing malware
patterns, wherein the noisy images serve as input and the corresponding noise-free images function as the
generator’s target output. The training workflow begins with the initialization phase, where the U-Net generator
and discriminator are defined. The optimizer employed is AdamW [], initialized with a learning rate of
.. To progressively adjust the learning rate during training and mitigate overfitting, the Cosine Annealing
learning rate scheduler is utilized. This algorithm effectively reduces the learning rate in a smooth manner,
mimicking the natural traversal of the optimization landscape. Additionally, it enables efficient computation
by employing an automatic adaptive schedule and reduces the need for extensive hyperparameter tuning.
Utilizing an initial learning rate of ., a minimum learning rate of ., and a training epoch of ,
the learning rate adjustment curve is depicted in Fig. .
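The shape of a cosine annealing schedule can be reproduced with the standard closed-form expression, sketched below in plain Python. The initial rate 2e-4, minimum rate 1e-6, and 100 epochs are hypothetical placeholders chosen for illustration, not the values used in the paper.

```python
import math


def cosine_annealing_lr(epoch, total_epochs, lr_max, lr_min):
    """Learning rate after `epoch` scheduler steps of cosine annealing."""
    cos_term = 1.0 + math.cos(math.pi * epoch / total_epochs)
    return lr_min + 0.5 * (lr_max - lr_min) * cos_term


# Hypothetical settings; substitute the actual hyperparameters as needed.
LR_MAX, LR_MIN, EPOCHS = 2e-4, 1e-6, 100
schedule = [cosine_annealing_lr(e, EPOCHS, LR_MAX, LR_MIN) for e in range(EPOCHS + 1)]
```

The schedule starts at the initial rate, passes through the midpoint of the two rates halfway through training, and reaches the minimum rate at the final epoch.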
Figure 4: The learning rate adjusted by the scheduler
The complete training process is outlined in Algorithm . Following the initialization of the generator
and discriminator, the training algorithm proceeds iteratively through the following steps: training the
generator, training the discriminator, validating model performance, and updating the learning rates.
The primary tasks of the generator involve generating fake images to produce denoised versions of
noisy inputs, receiving feedback from the discriminator regarding classification performance on both fake
and real images, and optimizing based on the MSE loss. For the discriminator, the loss is composed of two
components: the loss associated with classifying real images as “” and the loss for classifying fake images as
“.” The mathematical expressions for these are detailed below:
real_loss = criterion(real_output, ones_like(real_output)) ()
fake_loss = criterion(fake_output, zeros_like(fake_output)) ()
d_loss = real_loss + fake_loss ()
The adversarial objective defined as Eq. () is highly effective for denoising tasks because it facilitates
learning a robust mapping between noisy input data and the clean underlying distribution.
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$ ()
Algorithm 1: Training algorithm for GAN
1: Input & Defined components:
• x: Set of noisy malware images
• G: Generator network (UNetModel) initialized with parameters θ_G
• D: Discriminator network initialized with parameters θ_D
• η_G, η_D: Learning rates for G and D
• Optimizer: AdamW for both G and D
• Scheduler: Cosine Annealing Learning Rate scheduler
• num_epochs: Number of training epochs
2: Output:
• Trained generator network G
• Trained discriminator network D
3: for epoch = 1 to num_epochs do
4: Generator Training Phase:
• Generate fake images: x̂ ← G(x)
• Compute generator loss: L_G ← GAN_Loss_Generator(D(x̂)) + Reconstruction_Loss(x̂, x)
• Update generator parameters: θ_G ← θ_G − η_G ⋅ ∇_{θ_G} L_G
5: Discriminator Training Phase:
• Classify real and fake images: D_real(x) and D_fake(x̂)
• Compute discriminator loss: L_D ← −E[log(D_real(x))] − E[log(1 − D_fake(x̂))]
• Update discriminator parameters: θ_D ← θ_D − η_D ⋅ ∇_{θ_D} L_D
6: Apply learning rate scheduling:
• Update learning rates for G and D using cosine annealing scheduler.
7: end for
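The generator loss in Algorithm 1 combines an adversarial term with a reconstruction term. The sketch below illustrates that composition in plain Python. It rests on several assumptions: the adversarial term is taken to be the non-saturating form −log D(x̂), the reconstruction term is MSE, and `rec_weight` is a hypothetical weighting factor that the algorithm does not mention (set to 1 here, matching the plain sum).

```python
import math


def generator_loss(d_on_fake, denoised, target, rec_weight=1.0, eps=1e-12):
    """Adversarial term (assumed -log D(x_hat)) plus MSE reconstruction term."""
    # Adversarial term: small when the discriminator rates the fakes as real.
    adv = -sum(math.log(min(max(p, eps), 1.0)) for p in d_on_fake) / len(d_on_fake)
    # Reconstruction term: mean squared pixel difference to the reference image.
    rec = sum((a - b) ** 2 for a, b in zip(denoised, target)) / len(denoised)
    return adv + rec_weight * rec


# Discriminator fully fooled and a perfect reconstruction: both terms vanish.
best_case = generator_loss([1.0], [0.5, 0.5], [0.5, 0.5])
# Discriminator undecided and one pixel off by 1.0: ln 2 + 0.5.
mixed_case = generator_loss([0.5], [1.0, 0.0], [0.0, 0.0])
```

Keeping the two terms separate makes it easy to log them individually, which helps diagnose whether the generator is failing on realism or on pixel-level fidelity.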
In the adversarial framework, the generator aims to produce outputs that are indistinguishable from
true clean images, while the discriminator seeks to differentiate between real and generated outputs. Through
adversarial training, the generator progressively learns to model the underlying clean data distribution.
This objective is effective for denoising tasks, as the generator not only minimizes pixel-level differences
(e.g., MSE loss) but also captures higher-order structures that align the generated outputs with the true
clean image distribution. Furthermore, although we simulate two representative types of noise, adversarial
learning does not depend on explicit assumptions about the noise distribution. Instead, it has the capacity
to eliminate discrepancies between the input noisy data and their corresponding clean counterparts,
demonstrating its strong generalizability. Overall, the adversarial objective achieves a balance between
realism and accuracy, making it especially advantageous for image-based denoising tasks where traditional
methods often fall short.
4 Experiments
4.1 Datasets
We tested the proposed denoising technique on two malware imagery datasets in a multi-class classification
setting to fully validate the suggested techniques. The first dataset, Big [], was proposed by
Microsoft and includes , malware samples represented as disassembled assembly code and byte-level
features from distinct malware families. A class label, a -character hash value, and an identifier are used
to describe each file. Fig. 5 presents the malware types, class labels and their proportions in the training and
testing sets. The second dataset, MALIMG [], was proposed for research on visualizing malware imagery.
MALIMG contains different malware families, which provides a foundation for multi-class classification
tasks. This diversity reflects, to some extent, the variability in malware behaviors and families encountered
in practice. Fig. 6 presents the class labels and their initial proportions in the training set. Both datasets
represent, to some extent, the challenges associated with static malware analysis, particularly in the detection
and classification of binary files with established structures from previously identified malware families.
Although these datasets do not capture the behavioral aspects of malware (dynamic analysis) in real-world
scenarios, they encompass a diverse array of malware samples and are among the most widely used resources
for malware imagery analysis.
Figure 5: Data distribution in BIG
Figure 6: Data distribution in MALIMG
4.2 Experiment Settings
The two datasets are divided into training, validation, and test sets by the authors. In our experiment,
we followed three main settings to evaluate the performance of the classification and denoising models. First,
we used the original training and test sets to evaluate various classifiers and measure their performance.
Second, noise injection was applied to the training set, after which the classifiers were trained on the noisy
data and their performance was assessed using the original test set. Lastly, the primary experiment involved
employing denoising techniques. In this setting, noise was injected into the training set and a GAN-based
denoising model was implemented. The training and validation sets were used to train and validate the
denoising model, respectively. The pre-trained generator of the GAN was then used as a denoising model,
wherein the training set with noise injection served as the input data, and the test set was used to evaluate
classification performance. The test data remained completely unseen by the denoising model throughout
the training process while the validation set was fully utilized to validate the model’s performance.
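The noise injection step used in the second and third settings can be sketched as below. This is an illustration under stated assumptions: pixels are normalized to [0, 1], Gaussian noise is additive, speckle noise is multiplicative (pixel plus pixel times noise), and the results are clipped back to [0, 1]. The function names and the `sigma` value are hypothetical and are not the parameters used in the experiments.

```python
import random


def add_gaussian_noise(pixels, sigma, rng):
    """Additive noise: each pixel is shifted by an independent N(0, sigma^2) draw."""
    return [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]


def add_speckle_noise(pixels, sigma, rng):
    """Multiplicative noise: the perturbation scales with the pixel intensity."""
    return [min(1.0, max(0.0, p + p * rng.gauss(0.0, sigma))) for p in pixels]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
image = [0.0, 0.25, 0.5, 1.0]  # a flattened toy image
gaussian = add_gaussian_noise(image, 0.1, rng)
speckle = add_speckle_noise(image, 0.1, rng)
```

Because speckle noise scales with intensity, zero-valued pixels are left untouched, whereas Gaussian noise perturbs every pixel regardless of content.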
In this study, we employ four distinct CNN models as classifiers for an image-based malware detection
task. These models were selected because they are widely used in the field and represent diverse types of CNN
architectures with distinct design philosophies: DenseNet [] emphasizes feature reuse through densely
connected layers, EfficientNet [] optimizes performance with compound scaling, MobileNet [] focuses
on lightweight architectures for mobile and resource-constrained devices, and ResNet [] utilizes skip
connections to address vanishing gradient issues. This diversity ensures a comprehensive evaluation of noise
resilience across different CNN approaches.
The primary objective of this study is to assess the impact of noise injection on the performance of these
models and to evaluate the effectiveness of the proposed denoising technique. By comparing the models’
accuracy before noise injection, after noise injection, and following the application of the denoising method,
we aim to validate the denoising technique and to understand how noise affects different CNN-based
classifiers. To adapt the pre-trained models to our specific classification task, we modify their architectures
by replacing the final layers with a customized classifier head. Generally, the attached layers consist of a
linear feed-forward layer with units, a ReLU activation function, a dropout layer with a rate of %,
and a classification output layer. Given that the pre-trained weights are learned from ImageNet, a general
multi-class image dataset rather than malware images, we aim to enhance generalization to our task by
incorporating a dropout layer. Metrics including accuracy, precision, recall, and F1-score are selected for
performance evaluation.
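The four metrics can be computed for a multi-class task as sketched below. The support-weighted averaging used here is an assumption; it is suggested by the result tables, where test recall coincides with test accuracy, which is a property of weighted averaging. The function name and the toy labels are hypothetical.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1 over all classes."""
    n = len(y_true)
    support = Counter(y_true)      # number of true samples per class
    predicted = Counter(y_pred)    # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    accuracy = sum(correct.values()) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


metrics = weighted_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0])
```

With this averaging, a large gap between precision and F1-score points to predictions concentrated on a few classes, which is how the noisy-data results later in this section can be read.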
4.3 Results and Discussion
The following analysis evaluates the performance of four CNN-based models across three experimental
settings with original images and noise injections, focusing on the models’ performances on the test set and
the resilience to noise. In this section, all tables presenting the experimental results highlight the highest
value of each evaluation metric in bold for clarity and ease of comparison.
Table 1 presents the experiments on the original data and with noise injection for BIG. The
performance on the original images serves as the baseline for comparison. The effect of noise injections
is significant. ResNet outperforms all other models with the highest accuracy (.%), precision (.%),
recall (.%), and F1-score (.%). This showcases its strong capability in malware classification on clean
data. DenseNet and MobileNet closely follow, with comparable test accuracies (.%–.%) and F1-scores.
EfficientNet achieves a slightly lower performance compared to its counterparts at .% accuracy on the
test set, which indicates slightly reduced effectiveness on clean data.
Gaussian noise introduces random variations throughout the image, thereby degrading the model’s
classification accuracy. The results indicate a significant decline in performance across all models when
evaluated on the test set. On the training set, except for ResNet, all other models demonstrate reduced
accuracy, recall, precision, and F1-score. Notably, ResNet, which performed exceptionally well on clean data,
experiences the most substantial drop on the test set, with its accuracy plummeting to .%. This finding
suggests that, despite its robust capabilities with clean inputs, ResNet exhibits the least resilience to Gaussian
noise. Overall, Gaussian noise leads to a pronounced deterioration in classification performance, with all
models experiencing considerable losses in their original accuracy. Both ResNet and DenseNet, despite their
strong performances on clean data, display inadequate resilience to this type of noise.
Table 1: Performance metrics with/without noise injection on BIG
Train Test
Original Accuracy Precision Recall F1_score Accuracy Precision Recall F1_score
DenseNet . . . . . . . .
EfficientNet . . . . . . . .
MobileNet . . . . . . . .
ResNet 0.985 0.985 0.985 0.985 0.965 0.966 0.965 0.965
Gaussian noise Accuracy Precision Recall F1_score Accuracy Precision Recall F1_score
DenseNet . . . . . . . .
EfficientNet . . . . 0.280 0.325 0.280 0.250
MobileNet . . . . . . . .
ResNet 0.975 0.975 0.975 0.975 . . . .
Speckle noise Accuracy Precision Recall F1_score Accuracy Precision Recall F1_score
DenseNet . . . . 0.576 0.723 0.576 0.507
EfficientNet . . . . . . . .
MobileNet . . . . . . . .
ResNet 0.982 0.982 0.982 0.982 . . . .
Speckle noise, characterized by multiplicative noise that is associated with image features, exerts a
somewhat less detrimental effect than Gaussian noise. DenseNet achieves a test accuracy of .%, exhibiting
notable improvements in precision (.%) and F1-score (.%) compared to its performance under
Gaussian noise. This suggests that DenseNet demonstrates moderate resilience to speckle noise in contrast
to Gaussian noise. Both EfficientNet and MobileNet experience moderate declines in performance. While
ResNet maintains high performance on the training set and achieves an improved test accuracy of .%,
it exhibits diminished effectiveness under speckle noise relative to its near-optimal performance on the
original clean data. The observed precision, recall, and F1-scores indicate that ResNet encounters challenges
in sustaining robust classification in the presence of this type of noise.
Table 2 presents the performance metrics after the application of the denoising model. This analysis
centers on assessing the efficacy of denoising techniques as an integral component of the preprocessing
pipeline for malware classification under various noisy data conditions. Overall, the incorporation of
denoising techniques results in a significant enhancement in classification performance compared to the
outcomes derived from raw noisy data, applicable to both Gaussian and Speckle noise.
Notably, the extent of improvement is contingent upon the specific models employed and the type of
noise present; the effectiveness is markedly greater for Speckle noise, as evidenced by superior test accuracies
and F1-scores when compared to Gaussian noise. Additionally, the degree of enhancement is influenced
by the architecture of the classifier. For example, in the context of Gaussian noise, MobileNet exhibits
the most substantial improvement among all models in terms of accuracy, with the F1-score increasing
significantly to .%. This suggests that MobileNet derives the greatest benefit from denoising, both in
terms of class balance and overall performance. Conversely, models such as EfficientNet and DenseNet
demonstrate moderate benefits, while ResNet continues to experience considerable challenges with Gaussian
noise, resulting in relatively limited improvements from denoising. In contrast, for Speckle noise, all metrics
across various models exceed %, indicating a consistent enhancement in performance.
Table 2: Performance metrics with denoising model on MALIMG
Train Test
Gaussian + Denoising Accuracy Precision Recall F1_score Accuracy Precision Recall F1_score
DenseNet . . . . . . . .
EfficientNet . . . . . . . .
MobileNet . . . . 0.864 0.867 0.864 0.852
ResNet 0.983 0.981 0.983 0.982 . . . .
Speckle + Denoising Accuracy Precision Recall F1_score Accuracy Precision Recall F1_score
DenseNet . . . . 0.888 . 0.888 .
EfficientNet . . . . . 0.872 . 0.877
MobileNet . . . . . . . .
ResNet 0.985 0.984 0.985 0.984 . . . .
Table 3 presents the experiments on original data and with noise injection for MALIMG. With labels
to predict, the task is more complex than BIG. Performance on the original dataset provides the baseline
for comparison. DenseNet shows the best test performance with an accuracy of .%, precision of .%,
and a balanced F1-score of .%. This demonstrates its effectiveness for malware classification in clean and
complex datasets. MobileNet and ResNet perform moderately, and EfficientNet slightly underperforms, but
all models have excellent performance on both the training and test sets.
Table 3: Performance metrics with/without noise injection on MALIMG
Train Test
Original Accuracy Precision Recall F1_score Accuracy Precision Recall F1_score
DenseNet 0.989 0.989 0.989 0.989 0.985 0.985 0.985 0.985
EfficientNet . . . . . . . .
MobileNet . . . . . . . .
ResNet . . . . . . . .
Gaussian noise Accuracy Precision Recall F1_score Accuracy Precision Recall F1_score
DenseNet . . . . 0.237 0.557 0.237 0.279
EfficientNet . . . . . . . .
MobileNet . . . . . . . .
ResNet 0.985 0.984 0.985 0.984 . . . .
Speckle noise Accuracy Precision Recall F1_score Accuracy Precision Recall F1_score
DenseNet . . . . 0.845 0.867 0.845 0.826
EfficientNet . . . . . . . .
MobileNet . . . . . . . .
ResNet 0.988 0.988 0.988 0.988 . . . .
Gaussian noise induces significant performance degradation across all models, particularly evident
in the results from the test set. Among the models evaluated, DenseNet exhibits the highest performance
under Gaussian noise injection; however, its accuracy remains low at only .%, highlighting its limited
resilience to random pixel variations. While precision is relatively high at .%, the recall drops to .%,
resulting in an F1-score of .%. This suggests that DenseNet is capable of making accurate predictions for
certain classes but encounters challenges with imbalanced classification across the labels. Conversely, both
EfficientNet and MobileNet experience a catastrophic decline in performance on the test set, indicating their
extreme sensitivity to Gaussian noise, as they fail to produce meaningful predictions when faced with random
noise in the input data. ResNet demonstrates slightly better performance; however, despite its strong baseline
performance on clean data, it does not exhibit resilience to Gaussian noise within the increased complexity of
the -label dataset. Overall, Gaussian noise severely affects all models, with DenseNet displaying marginally
better resilience than the others. In contrast, EfficientNet and MobileNet perform poorly, with accuracy
and other metrics collapsing entirely. The increased number of labels further exacerbates the difficulty, as
Gaussian noise corrupts the features essential for distinguishing between fine-grained classes.
Speckle noise, being more structured than Gaussian noise, exerts a less disruptive effect on model
performance. DenseNet exhibits the smallest impact, with test accuracy decreasing from .% (clean) to
.%, resulting in an F1-score of .%. Its relatively high precision (.%) and recall (.%) suggest
that the model retains its capacity to classify most of the labels accurately, albeit with slightly diminished
confidence. In contrast, EfficientNet demonstrates significant underperformance, with test accuracy falling
to .% and a low F1-score of .%. MobileNet and ResNet experience moderate decreases. Overall,
speckle noise affects all models less severely than Gaussian noise. DenseNet is distinguished as the most
robust under speckle noise, achieving test accuracy comparable to that of clean datasets. MobileNet and
ResNet exhibit moderate resilience, while EfficientNet performs relatively poorly.
The effects of Gaussian noise are considerably more disruptive than those of speckle noise for this
dataset, which comprises labels. This may be exacerbated by the inherent complexity associated with the
dataset’s -label structure. The random characteristics of Gaussian noise interfere with the fine-grained
patterns that are essential for distinguishing among the labels. Models such as EfficientNet and MobileNet,
which are optimized for efficiency, exhibit complete failure under Gaussian noise but demonstrate partial
recovery when subjected to speckle noise, where the original characteristics of the images are preserved. Among
all evaluated models, DenseNet exhibits the greatest resilience in the presence of both Gaussian and speckle
noise, achieving the highest test accuracy under noisy conditions, particularly when exposed to speckle noise.
Although its performance is significantly impaired by Gaussian noise, it still outperforms the other models.
Table 4 presents the performance results following the application of the denoising process. Previously,
Gaussian noise resulted in significant performance deterioration, with performance metrics such
as accuracy, precision, recall, and F1-score all being severely affected. In contrast, the denoising process
markedly enhances model performance, particularly in the presence of Gaussian noise, which induces greater
disruptions compared to speckle noise. For DenseNet, the test accuracy improves to .% with denoising,
compared to only .% without it, indicating a substantial recovery.
Table 4: Performance metrics with denoising model on MALIMG
Train Test
Gaussian + Denoising Accuracy Precision Recall F1_score Accuracy Precision Recall F1_score
DenseNet . . . . . . . .
EfficientNet . . . . . . . .
MobileNet . . . . 0.864 0.867 0.864 0.852
ResNet 0.983 0.981 0.983 0.982 . . . .
Speckle + Denoising Accuracy Precision Recall F1_score Accuracy Precision Recall F1_score
DenseNet . . . . 0.888 . 0.888 0.860
EfficientNet . . . . . . . .
MobileNet . . . . . 0.878 . .
ResNet 0.985 0.984 0.985 0.984 . . . .
Additionally, precision, recall, and F1-score also exhibit improvements, suggesting balanced enhancements
across all metrics. Similarly, EfficientNet demonstrates moderate improvement. MobileNet shows a
significant recovery in test accuracy, reaching .% in the presence of denoising, compared to a mere .%
without it, making MobileNet the most improved model under Gaussian noise conditions. Precision achieves
.%, and the F1-score improves to .%, reflecting strong performance recovery with a high degree of
balance. Overall, MobileNet benefits the most among the evaluated models, indicating that its lightweight
architecture effectively capitalizes on the denoising process to recover lost features. However, despite the
application of denoising, ResNet’s performance remains the lowest under Gaussian noise, suggesting that
its deeper architecture struggles to generalize effectively in noisy conditions even with denoising applied.
Additional strategies, such as adversarial training, may be necessary to enhance its robustness. In conclusion,
the denoising technique proves to be highly effective in mitigating the impact of Gaussian noise; however, a
performance gap still persists when compared to the clean dataset.
Speckle noise exerts less severe impacts on performance; consequently, the recovery observed with
denoising is less dramatic yet remains significant. For DenseNet, the test accuracy increases to .%,
representing a modest improvement compared to the performance without denoising (.%). Nevertheless,
metrics such as precision (.%) and F1-score (.%) demonstrate balanced enhancements. DenseNet
achieves the highest performance among the evaluated models, highlighting its resilience and adaptability
when utilized in conjunction with denoising techniques. EfficientNet, MobileNet, and ResNet also exhibit
notable improvements across all metrics, underscoring the effectiveness of the denoising model.
From the perspective of noise injection, the performance degradation varies with the noise type,
which in turn leads to different degrees of improvement from the denoising model. Denoising was dramatically
effective under Gaussian noise, where all models saw significant performance recoveries. Since Speckle
noise was less destructive to begin with, the impact of denoising was less pronounced but still important
for classification accuracy. For model-specific observations, DenseNet performed consistently well in both
Gaussian and Speckle noise scenarios, making it the most robust model under noisy conditions with
denoising. EfficientNet and MobileNet fall behind but also show exceptional recovery after applying the
denoising technique. ResNet, despite being a high-performing model on clean data, still struggles the most
with Gaussian noise even after denoising. Its performance was better under Speckle noise, suggesting that it
copes with structured noise better than with random distortions.
4.4 Implementation and Future Enhancements
While the experiments have demonstrated the effectiveness of the denoising technique, this discussion
aims to address the implementation, application, and future enhancements of such a denoising model. The
U-Net denoising model, originally developed within a Generative Adversarial Network (GAN) framework,
provides a sophisticated approach to image denoising that holds value in cybersecurity and network-based
image processing applications. A key advantage of this model is its capacity to function as a standalone
denoising preprocessor, utilizing pre-trained weights without necessitating the comprehensive training
process associated with the full GAN architecture. Assuming a batch of input data is of size (, H, W), the
general structure of the proposed U-Net is outlined in Table 5.
Table 5: Implementation details of U-Net modules
Structure Components Output size of the block
Encoder “Down” blocks ([, H/], [, H/], [, H/], [, H/], [, H/])
Decoder “Up” blocks ([, H/], [, H/], [, H/], [, H/], [, H/])
Output Final_convolution (, H, W)
There are several potential applications for the pre-trained denoising model in a production environment.
The denoising model can be deployed as a standalone micro-service utilizing containerization
technologies such as Docker and Kubernetes, allowing for seamless integration into image processing
pipelines prior to executing classification tasks. Additionally, the model can be implemented on edge devices
equipped with GPU acceleration to facilitate real-time image processing or to enable distributed image
denoising across network endpoints. When employed in production environments, the preprocessing stage
can be structured as follows: generation of malware image matrices, normalization and standardization of
image pixel values, denoising of images, classification, and metric monitoring.
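The stages listed above can be wired together as in the following sketch. It is an illustration under stated assumptions: the byte-to-grayscale mapping (one byte per pixel, fixed row width, zero padding) is one common convention and may differ from the preprocessing used in this work, and `denoise` and `classify` are hypothetical callables standing in for the pre-trained generator and a CNN classifier.

```python
def bytes_to_image(data, width):
    """Reshape raw bytes into rows of `width` grayscale pixels, zero-padding the last row."""
    padded = data + bytes((-len(data)) % width)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def normalize(image):
    """Scale 8-bit pixel values into the [0, 1] range."""
    return [[pixel / 255.0 for pixel in row] for row in image]


def run_pipeline(data, width, denoise, classify):
    """Image generation -> normalization -> denoising -> classification."""
    image = normalize(bytes_to_image(data, width))
    return classify(denoise(image))


# Stand-in stages: an identity denoiser and a dummy classifier that returns the row count.
label = run_pipeline(b"\x00\xff\x80\x10\x20", 4,
                     denoise=lambda img: img,
                     classify=lambda img: len(img))
```

Passing the denoiser and classifier as arguments keeps the pipeline independent of any particular framework, so either stage can be swapped or bypassed without touching the rest.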
The pre-trained U-Net can be seamlessly integrated into network security applications for the preprocessing
of noisy malware visualization datasets, thereby enhancing feature extraction accuracy and
improving the robustness of classification models. Continuous maintenance can be achieved by monitoring
denoising performance metrics and establishing alerts for performance degradation. Setting up automated
retraining pipelines is also feasible where computational constraints allow.
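A degradation alert of the kind described above can be as simple as a rolling mean compared against a baseline. The sketch below is illustrative only: the class name, the choice of monitored score (for example, classification accuracy on a labeled audit batch), and the threshold values are all hypothetical.

```python
from collections import deque


class DegradationMonitor:
    """Flags degradation when the rolling mean drops below baseline - tolerance."""

    def __init__(self, baseline, tolerance, window):
        self.baseline = baseline
        self.tolerance = tolerance
        self.scores = deque(maxlen=window)  # keeps only the latest `window` scores

    def update(self, score):
        """Record a score; return True if an alert should be raised."""
        self.scores.append(score)
        if len(self.scores) < self.scores.maxlen:
            return False  # not enough observations yet
        mean = sum(self.scores) / len(self.scores)
        return mean < self.baseline - self.tolerance


monitor = DegradationMonitor(baseline=0.90, tolerance=0.05, window=3)
alerts = [monitor.update(s) for s in [0.91, 0.89, 0.90, 0.70, 0.72]]
```

Averaging over a window avoids alerting on a single bad batch; the same signal could also serve as the trigger for an automated retraining job.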
While the proposed denoising technique has demonstrated its effectiveness, there are several potential
challenges associated with real-world applications. Although the pre-trained U-Net weights have been
successfully tested on two distinct malware imagery datasets, indicating a degree of generalizability,
fine-tuning and optimization will likely be required to adapt the model to specific deployment environments. In
addition, incorporating an extra denoising model introduces additional processing time, which may extend
the overall malware detection process. Though Gaussian noise captures fundamental characteristics that may
overlap with various noise types, and speckle noise represents certain variations, the pre-trained models
may still struggle to handle unfamiliar or unseen noise patterns effectively. Furthermore, the model may
not be suitable for devices with resource constraints, such as those with limited memory, computational
capacity, or lack of robust hardware acceleration. From a data perspective, malware datasets often suffer from
imbalanced distributions among malware types or families, potentially leading to biased training, where the
model performs well for dominant classes but poorly for minority classes. This imbalance can further impede
the model’s ability to generalize effectively.
Building on the proposed denoising model, several additional directions warrant exploration to further
enhance its denoising capabilities. Firstly, incorporating more diverse noise types could significantly improve
the model’s robustness. Salt-and-pepper noise, characterized by the introduction of random white and black
pixels into an image or data stream, can represent scenarios in which malware authors insert extraneous data
or corrupt specific sections of a binary to obfuscate detection mechanisms. Quantization noise, resulting
from the mapping of continuous values to discrete levels, can simulate scenarios where malware undergoes
compression or encoding. Poisson noise, which involves fluctuations proportional to the signal intensity,
can mimic certain types of dynamic behavior in malware, particularly those that modify their own code in
response to environmental conditions. Phase noise, characterized by rapid, short-term variations in the phase
that compromise the integrity of the data, can emulate phase manipulation techniques employed by malware
to disrupt patterns upon which detection algorithms depend. Moreover, testing with synthetic mixed-noise
scenarios that combine multiple noise types would be valuable for simulating more realistic environments
and improving its ability to generalize to unseen noise patterns []. To further optimize the performance of
the U-Net GAN, advanced optimization strategies can be adopted. While dynamic learning rate schedules
have been employed in the current work, modifying the loss function by introducing perceptual loss or
incorporating adversarial losses has been shown to produce superior denoising results, as demonstrated in
prior studies []. Finally, diffusion models, renowned for their remarkable performance in image-based
generative tasks, progressively refine noisy data over multiple iterations, making them highly suitable for
complex denoising scenarios. These models have shown great potential to achieve high-quality restoration
even in severe noise conditions []. However, given the computational intensity associated with noise
generation in each training step, a practical approach would involve reducing the number of timesteps in
the diffusion process or leveraging a subset of refinement stages specifically tailored to the characteristics
of malware imagery. For practical implementation of the denoising technique, it is advisable to containerize
the denoising service to facilitate convenient usage and to configure a scalable cloud or edge infrastructure.
Additionally, it is essential to implement continuous and timely performance tracking to ensure optimal
operation and to promptly address any potential issues.
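Two of the additional noise types named above can be sketched in a few lines. This is a hypothetical illustration for future experiments, not part of the reported pipeline: pixels are assumed to lie in [0, 1], and the function names and the `amount` and `levels` parameters are assumptions.

```python
import random


def add_salt_and_pepper(pixels, amount, rng):
    """Replace a fraction `amount` of pixels with black (0.0) or white (1.0), half each."""
    noisy = []
    for p in pixels:
        u = rng.random()
        if u < amount / 2:
            noisy.append(0.0)   # pepper
        elif u < amount:
            noisy.append(1.0)   # salt
        else:
            noisy.append(p)     # pixel left unchanged
    return noisy


def quantize(pixels, levels):
    """Map each pixel to the nearest of `levels` evenly spaced values in [0, 1]."""
    step = levels - 1
    return [round(p * step) / step for p in pixels]


rng = random.Random(0)
untouched = add_salt_and_pepper([0.5] * 100, 0.0, rng)
saturated = add_salt_and_pepper([0.5] * 10, 1.0, rng)
binary = quantize([0.0, 0.4, 1.0], 2)
```

Mixed-noise scenarios can then be built by composing such functions, for example quantizing an image before applying salt-and-pepper corruption.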
5 Conclusion
When formulating malware classification as an image analysis task, it is crucial to consider obfuscation
simulation, implement noise injection to replicate this process, and develop effective techniques to mitigate
such noise. This paper proposes a U-Net GAN-based denoising model, which serves as a powerful and
adaptable solution for image preprocessing. By carefully addressing the deployment architecture and ensuring
generalizability across different noise types, this technology can substantially enhance image processing
pipelines across various domains. Furthermore, we investigate the noise resistance capabilities of different
CNN-based classifiers; key findings indicate that DenseNet exhibits strong resistance to noise, while ResNet,
although outperforming other models on clean data, demonstrates the least resilience to Gaussian noise.
Overall, this research conducts a comparative study on the performance of various CNN-based classifiers
under different noise injection conditions and proposes denoising techniques applicable to security domains.
This work contributes to enhancing cybersecurity by developing an efficient system for recovering essential
characteristics of malware imagery, thereby improving the model’s predictive performance.
Acknowledgement: We sincerely thank the reviewers and the editor for their valuable time, effort, and constructive
feedback, which greatly contributed to improving the quality of this work.
Funding Statement: This research is partially funded by the budget project FFZF--.
Author Contributions: The authors confirm contribution to the paper as follows: study conception and design:
Huiyao Dong, Igor Kotenko; data collection: Huiyao Dong; analysis and interpretation of results: Huiyao Dong;
manuscript preparation: Huiyao Dong, Igor Kotenko. All authors reviewed the results and approved the final version
of the manuscript.
Availability of Data and Materials: Data openly available in public repositories. The data that support
the findings of this study are openly available at Kaggle: Malware Classification
https://www.kaggle.com/c/malware-classification/data (accessed on January ) and Google Drive File
https://drive.google.com/file/d/MVzyIQj_kuEXzhClGKTZWhT_pr-/view (accessed on January ). The Python code developed for this
research can be found at the following repository: https://github.com/hydongmeow/denoising_malware (accessed on
January ).
Ethics Approval: Not applicable.
Conflicts of Interest: The authors declare no conflicts of interest to report regarding the present study.
References
. SonicWall. Mid-year cyber threat report; . [cited Nov ]. Available from: https://www.sonicwall.
com/-Mid- Year-Cyber-reat-Report.asp.
. Luo JS, Lo DCT. Binary malware image classification using machine learning with local binary pattern. In:
IEEE International Conference on Big Data (Big Data) ; ; Boston, MA, USA. p. –.
. Azab A, Khasawneh M. MSIC: malware spectrogram image classification. IEEE Access. ;:–. doi:.
/ACCESS...
. Dong H, Kotenko I. Image-based malware analysis for enhanced IoT security in smart cities. Int Things.
;():. doi:./j.iot...
. Bensaoud A, Kalita J, Bensaoud M. A survey of malware detection using deep learning. Mach Learn Appl.
;:. doi:./j.mlwa...
. Kumar S, Janet B. DTMIC: deep transfer learning for malware image classification. J Inf Secur Appl.
;:. doi:./j.jisa...
. Pham DP, Marion D, Mastio M, Heuser A. Obfuscation revealed: leveraging electromagnetic signals for obfuscated
malware classification. In: Proceedings of the th Annual Computer Security Applications Conference; ; New
York, NY, USA. p. –.
. Dong H, Kotenko I. VAE-GAN for Robust IoT malware detection and classification in intelligent urban
environments: an image analysis approach. In: International Conference on Risks and Security of Internet and Systems;
; Cham: Springer. p. –.
. Bensaoud A, Kalita J. Deep multi-task learning for malware image classification. J Inf Secur Appl. ;:.
doi:./j.jisa...
. Lee H, Kim S, Baek D, Kim D, Hwang D. Robust IoT malware detection and classification using opcode category
features on machine learning. IEEE Access. ;:–. doi:./ACCESS...
. Bhat P, Behal S, Dutta K. A system call-based android malware detection approach with homogeneous &
heterogeneous ensemble machine learning. Comput Secur. ;:. doi:./j.cose...
. Chaganti R, Ravi V, Pham TD. Image-based malware representation approach with EfficientNet convolutional
neural networks for effective malware classification. J Inf Secur Appl. ;:. doi:./j.jisa...
. Dao TV, Sato H, Kubo M. An attention mechanism for combination of CNN and VAE for image-based malware
classification. IEEE Access. ;:–. doi:./ACCESS...
. Thakur P, Kansal V, Rishiwal V. Hybrid deep learning approach based on LSTM and CNN for malware detection.
Wirel Pers Commun. ;():–. doi:./s---y.
. Karat G, Kannimoola JM, Nair N, Vazhayil A, Sujadevi V, Poornachandran P. CNN-LSTM hybrid model
for enhanced malware analysis and detection. Procedia Comput Sci. ;:–. doi:./j.procs..
..
. Dong H. Convolutional-free malware image classification using self-attention mechanisms. Inform Autom.
;():–. doi:./ia....
. Tian C, Fei L, Zheng W, Xu Y, Zuo W, Lin CW. Deep learning on image denoising: an overview. Neural Netw.
;:–. doi:./j.neunet....
. Su Y, Lian Q, Zhang X, Shi B, Fan X. Multi-scale cross-path concatenation residual network for Poisson denoising.
IET Image Process. ;():–. doi:./iet-ipr...
. Jifara W, Jiang F, Rho S, Cheng M, Liu S. Medical image denoising using convolutional neural network: a residual
learning approach. J Supercomput. ;:–. doi:./s---.
. Yuan Q, Zhang Q, Li J, Shen H, Zhang L. Hyperspectral image denoising employing a spatial-spectral deep residual
convolutional neural network. IEEE Trans Geoscience Remote Sens. ;():–. doi:./TGRS..
.
. Li S, Wen H, Deng L, Zhou Y, Zhang W, Li Z, et al. Denoising network of dynamic features for enhanced malware
classification. In: IEEE International Performance, Computing, and Communications Conference (IPCCC);
; Los Alamitos, CA, USA. p. –.
. Kumi S, Lee Suk-Ho. BM3D and deep image prior based denoising for the defense against adversarial attacks on
malware detection networks. Int J Adv Smart Converg. ;():–. doi:./IJASC.....
. Taheri R, Shojafar M, Arabikhan F, Gegov A. Unveiling vulnerabilities in deep learning-based malware detection:
differential privacy driven adversarial attacks. Comput Secur. ;:. doi:./j.cose...
. Popescu AB, Taca IA, Vizitiu A, Nita CI, Suciu C, Itu LM, et al. Obfuscation algorithm for privacy-preserving deep
learning-based medical image analysis. Appl Sci. ;():. doi:./app.
. Boncelet C. Image noise models. In: The essential guide to image processing. USA: Elsevier; . p. –.
. Maity A, Pattanaik A, Sagnika S, Pani S. A comparative study on approaches to speckle noise reduction in images.
In: International Conference on Computational Intelligence and Networks; ; Odisha, India. p. –.
. Racine R, Walker GA, Nadeau D, Doyon R, Marois C. Speckle noise and the detection of faint companions. Publ
Astron Soc Pac. ;():. doi:./.
. Fan CM, Liu TJ, Liu KH. SUNet: swin transformer UNet for image denoising. In: IEEE International
Symposium on Circuits and Systems (ISCAS); ; Austin, TX, USA: IEEE. p. –.
. Lu Z, Chen Y. Single image super-resolution based on a modified U-net with mixed gradient loss. Signal, Image
Video Process. ;():–. doi:./s---.
. Ahmad Z, Jaffri ZuA, Chen M, Bao S. Understanding GANs: fundamentals, variants, training challenges,
applications, and open problems. Multimed Tools Appl. . doi:./s---y.
. Loshchilov I, Hutter F. Decoupled weight decay regularization. arXiv:.. .
. Ronen R. Microsoft malware classification challenge. arXiv:.. .
. Nataraj L, Karthikeyan S, Jacob G, Manjunath BS. Malware images: visualization and automatic classification. In:
Proceedings of the th International Symposium on Visualization for Cyber Security; ; New York, NY, USA.
p. –.
. Huang G, Liu Z, Van Der Maaten L, Weinberger KQ. Densely connected convolutional networks. In: Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition; ; Honolulu, HI, USA. p. –.
. Tan M, Le Q. EfficientNet: rethinking model scaling for convolutional neural networks. In: International
Conference on Machine Learning; ; Long Beach, CA, USA: PMLR. p. –.
. Sandler M, Howard A, Zhu M, Zhmoginov A, Chen LC. MobileNetV2: inverted residuals and linear bottlenecks.
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; ; Salt Lake City, UT,
USA. p. –.
. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference
on Computer Vision and Pattern Recognition; ; Las Vegas, NV, USA. p. –.
. Zhang K, Zuo W, Chen Y, Meng D, Zhang L. Beyond a Gaussian denoiser: residual learning of deep CNN for image
denoising. IEEE Trans Image Process. ;():–. doi:./TIP...
. Ledig C, Theis L, Huszár F, Caballero J, Cunningham A, Acosta A, et al. Photo-realistic single image
super-resolution using a generative adversarial network. In: Proceedings of the IEEE Conference on Computer Vision
and Pattern Recognition; ; Honolulu, HI, USA. p. –.
. Saharia C, Ho J, Chan W, Salimans T, Fleet DJ, Norouzi M. Image super-resolution via iterative refinement. IEEE
Trans Pattern Analysis Mach Intell. ;():–. doi:./TPAMI...
Available via license: CC BY 4.0