En-hui Yang’s research while affiliated with University of Waterloo and other places

What is this page?


This page lists works of an author who doesn't have a ResearchGate profile or hasn't added the works to their profile yet. It is automatically generated from public (personal) data to further our legitimate goal of comprehensive and accurate scientific recordkeeping. If you are this author and want this page removed, please let us know.

Publications (230)


A coded knowledge distillation framework for image classification based on adaptive JPEG encoding
  • Article

February 2025 · 1 Read · 1 Citation · Pattern Recognition

Ahmed H. Salamah · [...] · En-Hui Yang

Figures: (1) partial derivatives of Q_d(θ) w.r.t. θ (left) and q (right) for α ∈ {100, 300, 500, 700}, with b = 3 and q = 0.1; (3) the CDL mechanism; (6) evolution of the weight distribution of a ResNet-110 convolutional layer during training via CDL on CIFAR-100.
Coded Deep Learning: Framework and Algorithm
  • Preprint
  • File available

January 2025 · 1 Read

The success of deep learning (DL) is often achieved with large models and high complexity during both training and post-training inference, hindering training in resource-limited settings. To alleviate these issues, this paper introduces a new framework dubbed "coded deep learning" (CDL), which integrates information-theoretic coding concepts into the inner workings of DL to significantly compress model weights and activations, reduce computational complexity at both training and post-training inference stages, and enable efficient model/data parallelism. Specifically, within CDL, (i) we first propose a novel probabilistic method for quantizing both model weights and activations, along with a soft differentiable variant that offers an analytic formula for gradient calculation during training; (ii) both the forward and backward passes during training are executed over quantized weights and activations, eliminating most floating-point operations and reducing training complexity; (iii) during training, both weights and activations are entropy-constrained so that they remain compressible in an information-theoretic sense throughout training, thus reducing communication costs in model/data parallelism; and (iv) the trained model in CDL is by default in a quantized format with compressible quantized weights, reducing post-training inference and storage complexity. Additionally, a variant of CDL, namely relaxed CDL (R-CDL), is presented to further improve the trade-off between validation accuracy and compression, though it requires full-precision training while keeping the other advantageous features of CDL intact. Extensive empirical results show that CDL and R-CDL outperform state-of-the-art DNN compression algorithms in the literature.


Conditional Mutual Information Based Diffusion Posterior Sampling for Solving Inverse Problems

January 2025 · 6 Reads

Inverse problems are prevalent across various disciplines in science and engineering. In the field of computer vision, tasks such as inpainting, deblurring, and super-resolution are commonly formulated as inverse problems. Recently, diffusion models (DMs) have emerged as a promising approach for addressing noisy linear inverse problems, offering effective solutions without requiring additional task-specific training. Specifically, with the prior provided by DMs, one can sample from the posterior by finding the likelihood. Since the likelihood is intractable, it is often approximated in the literature. However, this approximation compromises the quality of the generated images. To overcome this limitation and improve the effectiveness of DMs in solving inverse problems, we propose an information-theoretic approach. Specifically, we maximize the conditional mutual information $\mathrm{I}(\boldsymbol{x}_0; \boldsymbol{y} \mid \boldsymbol{x}_t)$, where $\boldsymbol{x}_0$ represents the reconstructed signal, $\boldsymbol{y}$ is the measurement, and $\boldsymbol{x}_t$ is the intermediate signal at stage $t$. This ensures that the intermediate signals $\boldsymbol{x}_t$ are generated in a way that the final reconstructed signal $\boldsymbol{x}_0$ retains as much information as possible about the measurement $\boldsymbol{y}$. We demonstrate that this method can be seamlessly integrated with recent approaches and, once incorporated, enhances their performance both qualitatively and quantitatively.


Figures: (2) the first two dimensions of the estimated posterior distributions for the configuration (d = 80, m = 1, σ = 10^{-1}) with a randomly generated A; (3), (4) qualitative results on the FFHQ dataset.
Enhancing Diffusion Models for Inverse Problems with Covariance-Aware Posterior Sampling

December 2024 · 2 Reads

Inverse problems exist in many disciplines of science and engineering. In computer vision, for example, tasks such as inpainting, deblurring, and super-resolution can be effectively modeled as inverse problems. Recently, denoising diffusion probabilistic models (DDPMs) have been shown to provide a promising solution to noisy linear inverse problems without the need for additional task-specific training. Specifically, with the prior provided by DDPMs, one can sample from the posterior by approximating the likelihood. In the literature, approximations of the likelihood are often based on the mean of the conditional densities of the reverse process, which can be obtained using Tweedie's formula. To obtain a better approximation to the likelihood, in this paper we first derive a closed-form formula for the covariance of the reverse process. Then, we propose a finite-difference method to approximate this covariance so that it can be readily obtained from existing pretrained DDPMs, thereby not increasing the complexity compared to existing approaches. Finally, based on the mean and the approximated covariance of the reverse process, we present a new approximation to the likelihood. We refer to this method as covariance-aware diffusion posterior sampling (CA-DPS). Experimental results show that CA-DPS significantly improves reconstruction performance without requiring hyperparameter tuning. The code for the paper is provided in the supplementary materials.


Going Beyond Feature Similarity: Effective Dataset Distillation Based on Class-aware Conditional Mutual Information

December 2024 · 3 Reads

Xinhao Zhong · [...] · Hao Fang · [...] · En-Hui Yang

Dataset distillation (DD) aims to minimize the time and memory needed for training deep neural networks on large datasets by creating a smaller synthetic dataset that achieves performance similar to that of the full real dataset. However, current dataset distillation methods often produce synthetic datasets that are excessively difficult for networks to learn from, because a substantial amount of information from the original data is compressed through metrics measuring feature similarity, e.g., distribution matching (DM). In this work, we introduce conditional mutual information (CMI) to assess the class-aware complexity of a dataset and propose a novel method that minimizes it. Specifically, we minimize the distillation loss while simultaneously constraining the class-aware complexity of the synthetic dataset by minimizing its empirical CMI computed from the feature space of pre-trained networks. Through a thorough set of experiments, we show that our method can serve as a general regularization for existing DD methods, improving both their performance and their training efficiency.



JPEG Inspired Deep Learning

October 2024 · 6 Reads

Although it is traditionally believed that lossy image compression, such as JPEG compression, has a negative impact on the performance of deep neural networks (DNNs), recent works have shown that well-crafted JPEG compression can actually improve the performance of deep learning (DL). Inspired by this, we propose JPEG-DL, a novel DL framework that prepends any underlying DNN architecture with a trainable JPEG compression layer. To make the quantization operation in JPEG compression trainable, a new differentiable soft quantizer is employed at the JPEG layer, and the quantization operation and the underlying DNN are then jointly trained. Extensive experiments show that, in comparison with standard DL, JPEG-DL delivers significant accuracy improvements across various datasets and model architectures while enhancing robustness against adversarial attacks. In particular, on some fine-grained image classification datasets, JPEG-DL can increase prediction accuracy by as much as 20.9%. Our code is available at https://github.com/JpegInspiredDl/JPEG-Inspired-DL.git.




JPEG Compliant Compression for DNN Vision

January 2024 · 4 Reads · 2 Citations · IEEE Journal on Selected Areas in Information Theory

Conventional image compression techniques are primarily developed for the human visual system. However, with the extensive use of deep neural networks (DNNs) for computer vision, more and more images will be consumed by DNN-based intelligent machines, which makes it crucial to develop image compression techniques customized for DNN vision while remaining JPEG compliant. In this paper, we revisit JPEG rate-distortion theory for DNN vision. First, we propose a novel distortion measure, dubbed the sensitivity-weighted error (SWE), for DNN vision. Second, we incorporate SWE into the soft decision quantization (SDQ) process of JPEG to trade SWE for rate. Finally, we develop an algorithm, called OptS, for designing optimal quantization tables for the luminance and chrominance channels, respectively. To test the performance of the resulting DNN-oriented compression framework and algorithm, image classification experiments are conducted on the ImageNet dataset for four prevalent DNN models. Results demonstrate that our proposed framework and algorithm achieve better rate-accuracy (R-A) performance than the default JPEG. For some DNN models, they reduce the compression rate by up to 67.84% with no accuracy loss compared to the default JPEG.


Citations (53)


... (2) it has been employed as an empirical defense method against adversarial attacks, effectively reducing adversarial perturbations and enhancing the adversarial robustness of DNNs (Dziugaite et al., 2016; Das et al., 2017; Guo et al., 2018); and (3) it has been integrated into the knowledge distillation (Hinton et al., 2015) framework by Salamah et al. (2024a), where it helps the teacher model transfer knowledge to the student model in a more effective way. In contrast, this paper focuses on leveraging JPEG to enhance the natural performance, instead of the robust performance, of DNNs without relying on any teacher model. ...

Reference:

JPEG Inspired Deep Learning
A coded knowledge distillation framework for image classification based on adaptive JPEG encoding
  • Citing Article
  • February 2025

Pattern Recognition

... Ditto proves effective in enhancing testing accuracy among local devices and promoting fairness. Ideas from information theory such as conditional mutual information could also be used to promote fairness in FL (Yang et al., 2023). ...

Conditional Mutual Information Constrained Deep Learning: Framework and Preliminary Results

... This approach significantly reduces communication latency and conserves uplink communication bandwidth. A notable challenge in FL, including OTA-FL, stems from the heterogeneity in the distributions of clients' local datasets [7]. This heterogeneity can lead to poor performance when the global model is applied to individual clients' private datasets, resulting in an unfair global model [8]. ...

Fed-IT: Addressing Class Imbalance in Federated Learning through an Information-Theoretic Lens
  • Citing Conference Paper
  • July 2024

... Moreover, the adherence to the default JPEG quantization also limits the effectiveness of this method. On the other hand, given a fixed DNN model, Zheng et al. (2023) and Salamah et al. (2024b) found that its performance could be slightly improved if the input images got compressed by JPEG with optimized quantization parameters. However, the performance gain is not significant due to the frozen DNN model. ...

JPEG Compliant Compression for DNN Vision
  • Citing Article
  • January 2024

IEEE Journal on Selected Areas in Information Theory

... Moreover, the adherence to the default JPEG quantization also limits the effectiveness of this method. On the other hand, given a fixed DNN model, Zheng et al. (2023) and Salamah et al. (2024b) found that its performance could be slightly improved if the input images got compressed by JPEG with optimized quantization parameters. However, the performance gain is not significant due to the frozen DNN model. ...

JPEG Compliant Compression for DNN Vision
  • Citing Conference Paper
  • October 2023

... The need for space-efficient representations of joint RNA sequence and secondary structure databases has been identified by Liu et al. in 2008 [16]. Their algorithm RNACompress, based on a stochastic context-free grammar (SCFG, defined below), has been recognized as an early application of ideas from grammar-based compression in the data-compression community [17,12]. As we demonstrate in this article, substantially better compression ratios can be achieved than Liu et al. report; interestingly, by carefully extending their very method to a general framework of SCFG-based compression. ...

Survey of Grammar-Based Data Structure Compression
  • Citing Article
  • November 2022

IEEE BITS the Information Theory Magazine

... In recent years, deploying deep neural networks (DNNs) in various fields such as computer vision and natural language processing has gained considerable momentum. Yet, DNNs are vulnerable to adversarial samples crafted by adding small and human-imperceptible adversarial perturbations to normal examples [13,45]. In particular, safety- and security-focused applications, for instance face recognition [25] and autonomous driving [4], are concerned about robustness to adversarial samples. ...

A Watermarking-Based Framework for Protecting Deep Image Classifiers Against Adversarial Attacks
  • Citing Conference Paper
  • June 2021

... For example, Qu et al. [2] combined U-Net [30] with JPEG compression layers [3], training an encoding-decoding structure to simulate various post-processing operations performed by OSNs. ...

Targeted Attack for Deep Hashing Based Retrieval
  • Citing Conference Paper
  • November 2020

Lecture Notes in Computer Science

... In [32], the optimal Lagrange multiplier for the B frame was predicted using the ratio between the average distortions of the P and B frames. A joint optimization method for identifying a suitable coding mode and Lagrange multiplier for interframe coding was presented in [33]. To accurately reflect the perceived image quality in RDO, numerous studies have incorporated the properties of the human visual system into image coding [34]-[37]. ...

Rate Distortion Optimization: A Joint Framework and Algorithms for Random Access Hierarchical Video Coding
  • Citing Article
  • October 2020

IEEE Transactions on Image Processing