February 2025 · 1 Read · 1 Citation · Pattern Recognition
January 2025 · 1 Read
The success of deep learning (DL) is often achieved with large models and high complexity during both training and post-training inference, hindering training in resource-limited settings. To alleviate these issues, this paper introduces a new framework dubbed "coded deep learning" (CDL), which integrates information-theoretic coding concepts into the inner workings of DL to significantly compress model weights and activations, reduce computational complexity at both the training and post-training inference stages, and enable efficient model/data parallelism. Specifically, within CDL, (i) we first propose a novel probabilistic method for quantizing both model weights and activations, along with a soft differentiable variant that offers an analytic formula for gradient calculation during training; (ii) both the forward and backward passes during training are executed over quantized weights and activations, eliminating most floating-point operations and reducing training complexity; (iii) during training, both weights and activations are entropy-constrained so that they remain compressible in an information-theoretic sense throughout training, thus reducing communication costs in model/data parallelism; and (iv) the trained model in CDL is by default in a quantized format with compressible quantized weights, reducing post-training inference and storage complexity. Additionally, a variant of CDL, namely relaxed CDL (R-CDL), is presented to further improve the trade-off between validation accuracy and compression, at the cost of requiring full-precision operations during training, while keeping the other advantageous features of CDL intact. Extensive empirical results show that CDL and R-CDL outperform state-of-the-art DNN compression algorithms in the literature.
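The listing does not reproduce the paper's actual quantizer, but the idea of a soft, differentiable quantization of weights and activations can be sketched as below. This is a minimal illustration assuming a temperature-controlled softmax over a fixed level grid; the function name, the 3-bit grid, and the temperature value are illustrative assumptions, not CDL's formulation.

```python
import torch

def soft_quantize(x, levels, temperature=0.1):
    """Differentiable soft quantization (illustrative sketch, not the paper's exact method).

    Each value in `x` is mapped to a convex combination of the quantization
    `levels`, weighted by a softmax over negative squared distances. As
    `temperature` -> 0 the output approaches hard nearest-level quantization,
    while remaining differentiable for gradient-based training.
    """
    d2 = (x.unsqueeze(-1) - levels) ** 2          # squared distance to every level
    w = torch.softmax(-d2 / temperature, dim=-1)  # soft assignment probabilities
    return (w * levels).sum(dim=-1)               # expected (soft) quantized value

# Toy usage with a hypothetical 3-bit symmetric grid: gradients flow through the quantizer.
levels = torch.linspace(-1.0, 1.0, steps=8)
weights = torch.randn(4, 4, requires_grad=True)
soft_quantize(weights, levels).sum().backward()
```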
January 2025 · 6 Reads
Inverse problems are prevalent across various disciplines in science and engineering. In the field of computer vision, tasks such as inpainting, deblurring, and super-resolution are commonly formulated as inverse problems. Recently, diffusion models (DMs) have emerged as a promising approach for addressing noisy linear inverse problems, offering effective solutions without requiring additional task-specific training. Specifically, with the prior provided by DMs, one can sample from the posterior by finding the likelihood. Since the likelihood is intractable, it is often approximated in the literature. However, this approximation compromises the quality of the generated images. To overcome this limitation and improve the effectiveness of DMs in solving inverse problems, we propose an information-theoretic approach. Specifically, we maximize the conditional mutual information $I(\mathbf{x}_0; \mathbf{y} \mid \mathbf{x}_t)$, where $\mathbf{x}_0$ represents the reconstructed signal, $\mathbf{y}$ is the measurement, and $\mathbf{x}_t$ is the intermediate signal at stage $t$. This ensures that the intermediate signals are generated in a way that the final reconstructed signal retains as much information as possible about the measurement $\mathbf{y}$. We demonstrate that this method can be seamlessly integrated with recent approaches and, once incorporated, enhances their performance both qualitatively and quantitatively.
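For reference, the maximized quantity can be written with the standard information-theoretic identity below; the notation $\mathbf{x}_0$, $\mathbf{y}$, $\mathbf{x}_t$ follows the reconstruction above, and how the paper actually estimates or bounds this quantity is not shown in this listing.

```latex
% Standard identity for the conditional mutual information maximized above
% (x_0: reconstructed signal, y: measurement, x_t: intermediate signal at stage t).
I(\mathbf{x}_0;\, \mathbf{y} \mid \mathbf{x}_t)
  = H(\mathbf{x}_0 \mid \mathbf{x}_t) - H(\mathbf{x}_0 \mid \mathbf{y}, \mathbf{x}_t)
```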
December 2024 · 2 Reads
Inverse problems exist in many disciplines of science and engineering. In computer vision, for example, tasks such as inpainting, deblurring, and super-resolution can be effectively modeled as inverse problems. Recently, denoising diffusion probabilistic models (DDPMs) have been shown to provide a promising solution to noisy linear inverse problems without the need for additional task-specific training. Specifically, with the prior provided by DDPMs, one can sample from the posterior by approximating the likelihood. In the literature, approximations of the likelihood are often based on the mean of the conditional densities of the reverse process, which can be obtained using Tweedie's formula. To obtain a better approximation to the likelihood, in this paper we first derive a closed-form formula for the covariance of the reverse process. Then, we propose a finite-difference-based method to approximate this covariance so that it can be readily obtained from existing pretrained DDPMs, thereby not increasing complexity compared to existing approaches. Finally, based on the mean and approximated covariance of the reverse process, we present a new approximation to the likelihood. We refer to this method as covariance-aware diffusion posterior sampling (CA-DPS). Experimental results show that CA-DPS significantly improves reconstruction performance without requiring hyperparameter tuning. The code for the paper is provided in the supplementary materials.
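The abstract does not give the closed-form covariance or the exact finite-difference scheme, but the general recipe of probing a pretrained noise predictor to estimate the reverse-process covariance, without any extra training, can be sketched as follows. Everything here (the Tweedie-style mean, the Rademacher probe, the step size `h`, and the final scaling) is an illustrative assumption rather than CA-DPS itself.

```python
import torch

@torch.no_grad()
def posterior_mean(eps_model, x_t, t, alpha_bar_t):
    """Tweedie-style posterior mean E[x_0 | x_t] from a pretrained noise predictor."""
    eps = eps_model(x_t, t)
    return (x_t - (1.0 - alpha_bar_t) ** 0.5 * eps) / alpha_bar_t ** 0.5

@torch.no_grad()
def fd_diag_covariance(eps_model, x_t, t, alpha_bar_t, h=1e-3):
    """Rough diagonal covariance estimate via a finite-difference Jacobian probe.

    By second-order Tweedie arguments, Cov[x_0 | x_t] is proportional to
    d E[x_0 | x_t] / d x_t; a single Rademacher probe gives a cheap, noisy
    estimate of that Jacobian's diagonal from two forward passes of the
    pretrained model.
    """
    m0 = posterior_mean(eps_model, x_t, t, alpha_bar_t)
    v = torch.randint_like(x_t, low=0, high=2) * 2.0 - 1.0      # +/-1 probe direction
    m1 = posterior_mean(eps_model, x_t + h * v, t, alpha_bar_t)
    jvp = (m1 - m0) / h                                          # finite-difference JVP
    diag_jac = v * jvp                                           # Hutchinson-style diagonal estimate
    return (1.0 - alpha_bar_t) / alpha_bar_t ** 0.5 * diag_jac   # schedule-dependent scaling
```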
December 2024 · 3 Reads
Dataset distillation (DD) aims to minimize the time and memory needed to train deep neural networks on large datasets by creating a smaller synthetic dataset that achieves performance similar to that of the full real dataset. However, current dataset distillation methods often produce synthetic datasets that are excessively difficult for networks to learn from, because a substantial amount of information from the original data is compressed through metrics measuring feature similarity, e.g., distribution matching (DM). In this work, we introduce conditional mutual information (CMI) to assess the class-aware complexity of a dataset and propose a novel method that minimizes it. Specifically, we minimize the distillation loss while simultaneously constraining the class-aware complexity of the synthetic dataset by minimizing its empirical CMI in the feature space of pre-trained networks. Through a thorough set of experiments, we show that our method can serve as a general regularizer for existing DD methods, improving both performance and training efficiency.
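The abstract does not spell out how the empirical CMI is computed. One common plug-in estimator, sketched below under that assumption, compares each synthetic sample's softmax output (from a pre-trained network) against its class-conditional average via KL divergence; the result would then be added to the distillation loss as a regularizer. The helper names and the weight `lam` are hypothetical.

```python
import torch
import torch.nn.functional as F

def empirical_cmi(logits, labels, num_classes):
    """Plug-in estimate of I(X; Yhat | Y) from a pre-trained network's logits.

    Illustrative sketch: each sample's softmax output is compared (via KL) to the
    average output of its class; averaging over samples gives an empirical
    conditional mutual information. The paper's exact estimator may differ.
    """
    p = F.softmax(logits, dim=1)                       # p(yhat | x_i)
    cmi = logits.new_zeros(())
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            p_c = p[mask]
            centroid = p_c.mean(dim=0, keepdim=True)   # class-conditional average output
            kl = (p_c * (p_c.clamp_min(1e-12).log()
                         - centroid.clamp_min(1e-12).log())).sum(dim=1)
            cmi = cmi + kl.sum()
    return cmi / labels.numel()

# Hypothetical combined objective (names are placeholders):
# loss = dm_loss(synthetic, real) + lam * empirical_cmi(pretrained_net(syn_images), syn_labels, C)
```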
November 2024 · 1 Citation
October 2024 · 6 Reads
Although it is traditionally believed that lossy image compression, such as JPEG compression, negatively impacts the performance of deep neural networks (DNNs), recent work has shown that well-crafted JPEG compression can actually improve the performance of deep learning (DL). Inspired by this, we propose JPEG-DL, a novel DL framework that prepends a trainable JPEG compression layer to any underlying DNN architecture. To make the quantization operation in JPEG compression trainable, a new differentiable soft quantizer is employed at the JPEG layer, and the quantization operation and the underlying DNN are then jointly trained. Extensive experiments show that, in comparison with standard DL, JPEG-DL delivers significant accuracy improvements across various datasets and model architectures while enhancing robustness against adversarial attacks. In particular, on some fine-grained image classification datasets, JPEG-DL can increase prediction accuracy by as much as 20.9%. Our code is available at https://github.com/JpegInspiredDl/JPEG-Inspired-DL.git.
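As a rough picture of what a trainable JPEG-style front end could look like, the sketch below applies an 8x8 block DCT, a learnable quantization table, and a sine-based soft-rounding surrogate so that gradients reach the table. The layer structure, the soft-round function, and the initial table value are assumptions for illustration; the actual JPEG-DL quantizer and training recipe may differ.

```python
import math
import torch
import torch.nn as nn

def dct_matrix(n=8):
    """Orthonormal type-II DCT matrix (n x n)."""
    k = torch.arange(n).float()
    m = torch.cos(math.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0] *= 1.0 / math.sqrt(2)
    return m * math.sqrt(2.0 / n)

class SoftJPEGLayer(nn.Module):
    """Sketch of a trainable JPEG-style layer prepended to a DNN (illustrative only)."""

    def __init__(self, block=8):
        super().__init__()
        self.register_buffer("D", dct_matrix(block))
        self.q = nn.Parameter(torch.full((block, block), 16.0))  # learnable quant table (kept positive in practice)
        self.block = block

    def soft_round(self, x):
        # smooth, differentiable approximation of round(x)
        return x - torch.sin(2 * math.pi * x) / (2 * math.pi)

    def forward(self, x):
        # x: (B, C, H, W) with H and W assumed to be multiples of the block size
        b, c, h, w = x.shape
        n = self.block
        blocks = x.unfold(2, n, n).unfold(3, n, n)               # (B, C, H//n, W//n, n, n)
        coeff = self.D @ blocks @ self.D.t()                     # 2-D DCT per block
        coeff = self.soft_round(coeff / self.q) * self.q         # soft quantize / dequantize
        rec = self.D.t() @ coeff @ self.D                        # inverse DCT
        return rec.permute(0, 1, 2, 4, 3, 5).reshape(b, c, h, w) # reassemble image
```

In a JPEG-DL-like setup, the output of such a layer would feed the underlying classifier, and the table `q` would be updated jointly with the network weights.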
July 2024 · 8 Reads · 8 Citations
July 2024 · 38 Reads · 6 Citations
January 2024 · 4 Reads · 2 Citations · IEEE Journal on Selected Areas in Information Theory
Conventional image compression techniques are primarily developed for the human visual system. However, with the extensive use of deep neural networks (DNNs) for computer vision, more and more images will be consumed by DNN-based intelligent machines, which makes it crucial to develop image compression techniques customized for DNN vision while remaining JPEG compliant. In this paper, we revisit JPEG rate-distortion theory for DNN vision. First, we propose a novel distortion measure, dubbed the sensitivity-weighted error (SWE), for DNN vision. Second, we incorporate SWE into the soft decision quantization (SDQ) process of JPEG to trade SWE for rate. Finally, we develop an algorithm, called OptS, for designing optimal quantization tables for the luminance and chrominance channels, respectively. To test the performance of the resulting DNN-oriented compression framework and algorithm, image classification experiments are conducted on the ImageNet dataset with four prevalent DNN models. The results demonstrate that the proposed framework and algorithm achieve better rate-accuracy (R-A) performance than default JPEG. For some DNN models, they reduce the compression rate by up to 67.84% with no accuracy loss compared to default JPEG.
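The paper's exact definition of SWE is not reproduced in this listing; one plausible form, consistent with the name, weights the squared error of each DCT coefficient by the DNN's sensitivity to that coefficient. The formula below is an illustrative sketch under that assumption, with $S_k$ a per-frequency sensitivity and $C_k$, $\hat{C}_k$ the original and quantized coefficients of an 8x8 block.

```latex
% Illustrative sensitivity-weighted error for one 8x8 block
% (assumed form; not quoted from the paper).
\mathrm{SWE} \;=\; \sum_{k=1}^{64} S_k \,\bigl(C_k - \hat{C}_k\bigr)^{2}
```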
... (2) it has been employed as an empirical defense method against adversarial attacks, effectively reducing adversarial perturbations and enhancing the adversarial robustness of DNNs (Dziugaite et al., 2016; Das et al., 2017; Guo et al., 2018); and (3) it has been integrated into the knowledge distillation (Hinton et al., 2015) framework by Salamah et al. (2024a), where it helps the teacher model transfer knowledge to the student model in a more effective way. In contrast, this paper focuses on leveraging JPEG to enhance the natural performance, instead of the robust performance, of DNNs without relying on any teacher model. ...
Reference:
JPEG Inspired Deep Learning
February 2025
Pattern Recognition
... Ditto proves effective in enhancing testing accuracy among local devices and promoting fairness. Ideas from information theory such as conditional mutual information could also be used to promote fairness in FL (Yang et al., 2023). ...
July 2024
... This approach significantly reduces communication latency and conserves uplink communication bandwidth. A notable challenge in FL, including OTA-FL, stems from the heterogeneity in the distributions of clients' local datasets [7]. This heterogeneity can lead to poor performance when the global model is applied to individual clients' private datasets, resulting in an unfair global model [8]. ...
July 2024
... Moreover, the adherence to the default JPEG quantization also limits the effectiveness of this method. On the other hand, given a fixed DNN model, Zheng et al. (2023) and Salamah et al. (2024b) found that its performance could be slightly improved if the input images got compressed by JPEG with optimized quantization parameters. However, the performance gain is not significant due to the frozen DNN model. ...
Reference:
JPEG Inspired Deep Learning
January 2024
IEEE Journal on Selected Areas in Information Theory
... Moreover, the adherence to the default JPEG quantization also limits the effectiveness of this method. On the other hand, given a fixed DNN model, Zheng et al. (2023) and Salamah et al. (2024b) found that its performance could be slightly improved if the input images got compressed by JPEG with optimized quantization parameters. However, the performance gain is not significant due to the frozen DNN model. ...
Reference:
JPEG Inspired Deep Learning
October 2023
... The need for space-efficient representations of joint RNA sequence and secondary structure databases has been identified by Liu et al. in 2008 [16]. Their algorithm RNACompress, based on a stochastic context-free grammar (SCFG, defined below), has been recognized as an early application of ideas from grammar-based compression in the data-compression community [17,12]. As we demonstrate in this article, substantially better compression ratios can be achieved than Liu et al. report; interestingly, by carefully extending their very method to a general framework of SCFG-based compression. ...
November 2022
IEEE BITS the Information Theory Magazine
... In recent years, deploying deep neural networks (DNNs) in various fields such as computer vision and natural language processing has gained considerable momentum. Yet, DNNs are vulnerable to adversarial samples crafted by adding small and human-imperceptible adversarial perturbations to normal examples [13,45]. In particular, safety- and security-focused applications, for instance face recognition [25] and autonomous driving [4], are concerned about robustness to adversarial samples. ...
June 2021
... In fact, compression on the HAM10K dataset seems to act as a regularizer and brings a slight improvement in the task performance. This phenomenon has also been observed in other works (Yang, Amer, and Jiang 2021; Choi and Bajić 2022). ...
July 2021
Entropy
... For example, Qu et al. [2] combined U-Net [30] with JPEG compression layers [3], training an encoding-decoding structure to simulate various post-processing operations performed by OSNs. ...
November 2020
Lecture Notes in Computer Science
... In [32], the optimal Lagrange multiplier for the B frame was predicted using the ratio between the average distortions of the P and B frames. A joint optimization method for identifying a suitable coding mode and Lagrange multiplier for interframe coding was presented in [33]. To accurately reflect the perceived image quality in RDO, numerous studies have incorporated the properties of the human visual system into image coding [34]-[37]. ...
October 2020
IEEE Transactions on Image Processing