Yuzhi Zhao

City University of Hong Kong | CityU · Department of Electronic Engineering

Bachelor of Engineering

36
Publications
4,403
Reads
360
Citations
Introduction
Yuzhi Zhao received his BEng degree from the School of Electronic Information and Communication (Excellence Class, 电信卓越班), Huazhong University of Science and Technology (HUST). He is currently a PhD student at the Department of Electronic Engineering, City University of Hong Kong.

Publications (36)
Preprint
Night imaging with modern smartphone cameras is troublesome due to low photon count and unavoidable noise in the imaging system. Directly adjusting exposure time and ISO ratings cannot obtain sharp and noise-free images at the same time in low-light conditions. Though many methods have been proposed to enhance noisy or blurry night images, their pe...
Article
Spatio-temporal representation learning is critical for video self-supervised representation. Recent approaches mainly use contrastive learning and pretext tasks. However, these approaches learn representation by discriminating sampled instances via feature similarity in the latent space while ignoring the intermediate state of the learned represen...
Preprint
The appearances of children are inherited from their parents, which makes it feasible to predict them. Predicting realistic children's faces may help settle many social problems, such as age-invariant face recognition, kinship verification, and missing child identification. It can be regarded as an image-to-image translation task. Existing approach...
Article
In this paper, we present a novel end-to-end pose transfer framework to transform a source person image to an arbitrary pose with controllable attributes. Due to the spatial misalignment caused by occlusions and multi-viewpoints, maintaining high-quality shape and texture appearance is still a challenging problem for pose-guided person image synthe...
Article
To leverage the strong cross-frame relations of videos, many video semantic segmentation methods tend to explore feature reuse and feature warping based on motion clues. However, since the video dynamics are too complex to model accurately, some warped feature values may be invalid. Moreover, the warping errors can accumulate across frames, thereby...
Preprint
Spatio-temporal representation learning is critical for video self-supervised representation. Recent approaches mainly use contrastive learning and pretext tasks. However, these approaches learn representation by discriminating sampled instances via feature similarity in the latent space while ignoring the intermediate state of the learned represen...
Article
With the rapid development of convolutional neural networks, both instance segmentation and semantic segmentation have achieved remarkable performance. Recently, many efforts have been made to use a unified Encoder-Decoder architecture to solve these two segmentation tasks simultaneously. The encoder extracts high-level features from the input...
Article
Human parsing has drawn a lot of attention from the public due to its critical role in high-level computer vision applications. Recent works demonstrated the effectiveness of utilizing context modules and additional information in improving the performance of human parsing. However, ambiguous objects, small scaling and occlusion problems are still t...
Preprint
We propose a hybrid recurrent Video Colorization with Hybrid Generative Adversarial Network (VCGAN), an improved approach to video colorization using end-to-end learning. The VCGAN addresses two prevalent issues in the video colorization domain: temporal consistency and unification of colorization network and refinement network into a single archit...
Preprint
Due to unreliable geometric matching and content misalignment, most conventional pose transfer algorithms fail to generate fine-grained person images. In this paper, we propose a novel framework Spatial Content Alignment GAN (SCAGAN) which aims to enhance the content consistency of garment textures and the details of human characteristics. We first...
Article
Deep reinforcement learning (DRL) has been utilized in numerous computer vision tasks, such as object detection and autonomous driving. However, relatively few DRL methods have been proposed in the area of image segmentation, particularly in left ventricle segmentation. Reinforcement learning-based methods in earlier works often rely on learning...
Chapter
Although convolutional network-based methods have boosted the performance of single image super-resolution (SISR), the huge computation costs restrict their practical applicability. In this paper, we develop a computation-efficient yet accurate network based on the proposed attentive auxiliary features (A²F) for SISR. Firstly, to explore the fe...
Preprint
Given a grayscale photograph, the colorization system estimates a visually plausible colorful image. Conventional methods often use semantics to colorize grayscale images. However, in these methods, only classification semantic information is embedded, resulting in semantic confusion and color bleeding in the final colorized image. To address these...
Preprint
Many photographs from the last century were captured under undesirable conditions. Thus, they are often noisy, regionally incomplete, and grayscale formatted. Conventional approaches mainly focus on only one of these problems, so their restoration results are not perceptually sharp or clean enough. To solve these problems, we propose a noise pri...
Article
Given a grayscale photograph, the colorization system estimates a visually plausible colorful image. Conventional methods often use semantics to colorize grayscale images. However, in these methods, only classification semantic information is embedded, resulting in semantic confusion and color bleeding in the final colorized image. To address these...
Preprint
Although convolutional network-based methods have boosted the performance of single image super-resolution (SISR), the huge computation costs restrict their practical applicability. In this paper, we develop a computation-efficient yet accurate network based on the proposed attentive auxiliary features (A²F) for SISR. Firstly, to explore the feat...
Preprint
This paper reviews the AIM 2020 challenge on efficient single image super-resolution with focus on the proposed solutions and results. The challenge task was to super-resolve an input image with a magnification factor ×4 based on a set of prior examples of low and corresponding high resolution images. The goal is to devise a network that reduces on...
Article
Data augmentation is critical for deep learning-based human activity recognition (HAR) systems. However, conventional data augmentation methods, such as random-cropping, may generate bad samples that are unrelated to a particular activity (e.g., background patches without salient motion information). As a result, the random-cropping based data...
Preprint
Capturing visual images with a hyperspectral camera has been successfully applied to many areas due to its narrow-band imaging technology. Hyperspectral reconstruction from RGB images denotes a reverse process of hyperspectral imaging by discovering an inverse response function. Current works mainly map RGB images directly to corresponding spectrum...
Preprint
This paper reviews the NTIRE 2020 challenge on real image denoising with focus on the newly introduced dataset, the proposed methods and their results. The challenge is a new version of the previous NTIRE 2019 challenge on real image denoising that was based on the SIDD benchmark. This challenge is based on a newly collected validation and testing...
Article
Deep learning-based image quality assessment (IQA) has been shown to greatly improve the quality score prediction accuracy of images with single distortion. However, because these models lack generalizability and the accuracy of multidistortion-based image data is relatively low, designing reliable IQA systems is still an open issue. In this paper,...
Chapter
This paper reviews the AIM 2020 challenge on efficient single image super-resolution with focus on the proposed solutions and results. The challenge task was to super-resolve an input image with a magnification factor ×4 based on a set of prior examples of low and corresponding high resolution images. The goal is to devise a network that...
Conference Paper
This paper reviews the first AIM challenge on mapping camera RAW to RGB images with the focus on proposed solutions and results. The participating teams were solving a real-world photo enhancement problem, where the goal was to map the original low-quality RAW images from the Huawei P20 device to the same photos captured with the Canon 5D DSLR came...
Conference Paper
RAW files are widely applied in cameras and scanners as storage because they contain original optical data. Different cameras usually process the RAW files using diverse algorithms that are incompatible. To address the issue, we propose a general transformation method for cross-camera RAW to RGB mapping based on Generative Adversarial Network (GAN)...
Article
Deep learning-based image hashing methods learn hash codes by using powerful feature extractors and nonlinear transformations to achieve highly efficient image retrieval. For most end-to-end deep hashing methods, the supervised learning process relies on pair-wise or triplet-wise information to provide an internal relationship of similarity data. H...
Article
Conventionally, classifiers designed for face liveness detection are trained on real-world images, where real-face images and the corresponding face presentation attacks (PA) overlap substantially. However, little research has been carried out on utilizing the combination of real-world face images and face images generated by deep convolution...

Project (1)
Project
Distinguish whether the human face in front of the camera is real or fake