Yunhui Shi

Beijing University of Technology, Beijing, China

Publications (47) · 12.27 Total impact

  • Weijia Zhu · Wenpeng Ding · Jizheng Xu · Yunhui Shi · Baocai Yin
    ABSTRACT: By considering the increasing importance of screen contents, the high efficiency video coding (HEVC) standard includes screen content coding as one of its requirements. In this paper, we demonstrate that enabling frame level block searching in HEVC can significantly improve coding efficiency on screen contents. We propose a hash-based block matching scheme for the intra block copy mode and the motion estimation process, which enables frame level block searching in HEVC without changing the HEVC syntax. In the proposed scheme, the blocks sharing the same hash values with the current block are selected as prediction candidates. Then the hash-based block selection is employed to select the best candidates. To achieve the best coding efficiency, the rate distortion optimization is further employed to improve the proposed scheme by balancing the coding cost of motion vectors and prediction difference. Compared with HEVC, the proposed scheme achieves 21% and 37% bitrate savings with the all-intra and low-delay configurations, respectively, with encoding time reduction. Up to 59% bitrate saving can be achieved on sequences with large motions.
    IEEE Transactions on Multimedia 01/2015; 17(7):1-1. DOI:10.1109/TMM.2015.2428171 · 1.78 Impact Factor
  • Weijia Zhu · Wenpeng Ding · Jizheng Xu · Yunhui Shi · Baocai Yin
    ABSTRACT: Screen content like cartoons, captures of typical computer screens or video with text overlay or news ticker is an important category of video, which needs new techniques beyond the existing video coding techniques. In this paper, we analyze the characteristics of screen content and coding efficiency of HEVC on screen content. We propose a new coding scheme, which adopts a non-transform representation, separating screen content into color component and structure component. Based on the proposed representation, two coding modes are designed for screen content to exploit the directional correlation and non-translational changes in screen video sequences. The proposed scheme is then seamlessly incorporated into the HEVC structure and implemented into HEVC range extension reference software HM9.0. Experimental results show that the proposed scheme achieves up to 52.6% bitrate saving compared with HM9.0. On average, 35.1%, 29.2% and 23.6% bitrate savings are achieved with intra, random-access and low-delay configurations, respectively. The visual quality of the decoded video sequences is also significantly improved by reducing ringing artifacts around sharp edges and preserving the shape of text without blur.
    IEEE Transactions on Multimedia 08/2014; 16(5):1316-1326. DOI:10.1109/TMM.2014.2315782 · 1.78 Impact Factor
  • Weijia Zhu · Wenpeng Ding · Jizheng Xu · Yunhui Shi · Baocai Yin
    ABSTRACT: Screen contents with complex structure contain random combinations of text, graphics and camera-captured images, which makes them difficult to compress efficiently with traditional video codecs. In this paper, we propose a 2-D dictionary based scheme to exploit the repeated patterns on screen content. In the proposed scheme, the current block is predicted from the reconstructed region using a hash-based block searching scheme. A hierarchical two-level hash based searching scheme is designed to find the best matching block for each block. The first-level hash function is used to search the blocks similar to the current block in the constructed 2-D dictionary. The second-level hash function is used to update the 2-D dictionary, which filters out the identical blocks from the blocks found using the first-level hash function. The proposed scheme is incorporated into the HEVC framework as an additional mode. Experimental results show that the proposed scheme achieves significant coding performance improvements on screen contents compared with HEVC.
    2014 Data Compression Conference (DCC); 03/2014
  • Jin Wang · Yunhui Shi · Wenpeng Ding · Baocai Yin
    ABSTRACT: Linear representation models are effective to represent the correlation in image interpolation. However, linear models usually lack constraints of the representation coefficient. In this paper, we propose a low rank matrix recovery based image interpolation to reinforce the sparsity of representation coefficient implicitly. Since both the local and nonlocal correlation is pervasive in natural images, we exploit such correlations by incorporating the local and nonlocal modeling, which fully utilizes the redundancy in images and improves the representation ability of our model. By minimizing the sum of the rank of data matrices which reflect the linear relationship among local patch pixels and nonlocal similar patch pixels, a precise low rank approximation of the missing pixels is obtained according to the low rank matrix recovery theory. A Split Bregman based minimization algorithm is developed to efficiently solve the low rank recovery problem. Extensive experimental results indicate the proposed method outperforms the traditional methods in both the objective and subjective visual quality.
    2013 Picture Coding Symposium (PCS); 12/2013
  • Weijia Zhu · Jizheng Xu · Wenpeng Ding · Yunhui Shi · Baocai Yin
    ABSTRACT: Screen content refers to image/video generated by electronic devices. Existing state-of-the-art image/video coding standards do not exploit the anisotropic features. This paper proposes an adaptive LZMA-based coding scheme for screen content to achieve better coding efficiency. Adaptive scanning and directional matching are incorporated into LZMA in our scheme. Both of them exploit the directional correlations in screen content to increase the matching probability of LZMA, and thus achieve better coding performance. Experimental results show that the proposed scheme significantly outperforms existing LZMA-based coding schemes on screen content coding.
    2013 Picture Coding Symposium (PCS); 12/2013
  • Na Qi · Yunhui Shi · Xiaoyan Sun · Jingdong Wang · Wenpeng Ding
    ABSTRACT: An analysis sparse model represents an image signal by multiplying it using an analysis dictionary, leading to a sparse outcome. It transforms an image (two dimensional signal) into a one-dimensional (1D) vector. However, this 1D model ignores the two dimensional property and breaks the local spatial correlation inside images. In this paper, we propose a two dimensional (2D) analysis sparse model. Our 2D model uses two analysis dictionaries to efficiently exploit the horizontal and vertical features simultaneously. The corresponding sparse coding and dictionary learning algorithm are also presented in this paper. The 2D sparse model is further evaluated for image denoising. Experimental results demonstrate our 2D analysis sparse model outperforms a state-of-the-art 1D analysis model in terms of both denoising ability and memory usage.
    2013 20th IEEE International Conference on Image Processing (ICIP); 09/2013
  • Zhen Zhang · Yunhui Shi · Wenpeng Ding · Baocai Yin
    ABSTRACT: Compressive sensing (CS) theory, which has been widely used in magnetic resonance (MR) image processing, indicates that a sparse signal can be reconstructed by an optimization process from non-adaptive linear projections. Since MR images commonly possess a blocky structure and have sparse representations under certain wavelet bases, total variation (TV) and wavelet domain ℓ1 norm regularization are enforced together (TV-wavelet L1 method) to improve the recovery accuracy. However, the components of wavelet coefficients are different: the low-frequency components of an image, which carry the main energy of the MR image, have a decisive impact on reconstruction quality. In this paper, we propose a TV and wavelet L2–L1 model (TVWL2–L1) to measure the low frequency wavelet coefficients with the ℓ2 norm and the high frequency wavelet coefficients with the ℓ1 norm. We present two methods to approach this problem, based on an operator splitting algorithm and a proximal gradient algorithm. Experimental results demonstrate that our method can clearly improve the quality of MR image recovery compared with the original TV-wavelet method.
    Journal of Visual Communication and Image Representation 02/2013; 24(2):187–195. DOI:10.1016/j.jvcir.2012.05.006 · 1.36 Impact Factor
  • Yunhui Shi · Qian Li · Wenpeng Ding · Baocai Yin
    Beijing Gongye Daxue Xuebao / Journal of Beijing University of Technology 01/2013; 39(3).
  • Na Qi · Yunhui Shi · Xiaoyan Sun · Jingdong Wang · Baocai Yin
    ABSTRACT: Sparse representation has been proved to be very efficient in machine learning and image processing. Traditional image sparse representation formulates an image into a one dimensional (1D) vector which is then represented by a sparse linear combination of the basis atoms from a dictionary. This 1D representation ignores the local spatial correlation inside one image. In this paper, we propose a two dimensional (2D) sparse model to more efficiently exploit the horizontal and vertical features, which are represented by two dictionaries simultaneously. The corresponding sparse coding and dictionary learning algorithms are also presented in this paper. The 2D synthesis model is further evaluated in image denoising. Experimental results demonstrate our 2D synthesis sparse model outperforms the state-of-the-art 1D model in terms of both objective and subjective qualities.
    2013 IEEE International Conference on Multimedia and Expo (ICME); 01/2013
  • Zhen Zhang · Yunhui Shi · Baocai Yin
    ABSTRACT: Since magnetic resonance (MR) images commonly possess a blocky structure and have sparse representations under certain wavelet bases, total variation (TV) and wavelet domain ℓ1 norm regularization are often enforced together (TV-wavelet method) to improve the recovery accuracy. However, this model ignores that a family of wavelet coefficients has a natural grouping of its components. In this paper, we propose a new TV-Group sparse model which combines TV and a wavelet domain group sparse penalty. The corresponding algorithm, based on the composite splitting method, is employed to solve this TV-Group sparse model. Experimental results show that our model can clearly improve both objective and subjective qualities of MR image recovery compared with the TV-wavelet model.
    2013 IEEE International Conference on Multimedia and Expo (ICME); 01/2013
  • Baocai Yin · Yunhui Shi · Wenpeng Ding · Yongli Hu · Jinghua Li
    01/2013; 43(2):226. DOI:10.1360/112011-1423
  • Yunhui Shi · Na Qi · Baocai Yin · Wenpeng Ding
    ABSTRACT: The analysis sparse model has been successfully used for a variety of tasks such as image denoising, deblurring, and most recently compressed sensing, and has therefore attracted much attention. K-SVD is a mature dictionary learning approach for the analysis sparse model. However, it represents images as one-dimensional signals, which fails to capture spatial correlations correctly. In this paper, we propose a novel analysis sparse model in which the analysis dictionary is derived from two analysis operators that act on an image, leading to a sparse outcome. A two-dimensional K-SVD (2D-KSVD) is proposed to train the analysis sparse dictionaries. Experiments on image denoising validate that the proposed analysis dictionary can express more spatial and frequency characteristics of images and that, using this dictionary, the two-dimensional analysis sparse model outperforms the traditional analysis model in terms of PSNR.
    Proceedings of the 13th Pacific-Rim conference on Advances in Multimedia Information Processing; 12/2012
  • Yunhui Shi · Bo Wen · Wenpeng Ding · Na Qi · Baocai Yin
    ABSTRACT: The benefit of using the geometry image to represent an arbitrary 3D mesh is that the 3D mesh can be re-sampled as a completely regular structure and coded efficiently by common image compression methods. For geometry image-based 3D mesh compression, we need to code the normal-map images while coding geometry images to improve the subjective quality and realistic effects of the reconstructed model. In traditional methods, a geometry image and a normal-map image are coded independently. However a strong correlation exists between these two kinds of images, because both of them are generated from the same 3D mesh and share the same parameterization. In this paper we propose a predictive coding framework, in which the normal-map image is predicted based on the geometric correlation between them. Additionally we utilize the strong geometric correlation among three components of normal-map image to improve the predicting accuracy. The experimental results show the proposed coding framework improves the coding efficiency of normal-map image, meanwhile the realistic effect of a 3D mesh is significantly enhanced.
    Multimedia Tools and Applications 06/2012; DOI:10.1007/s11042-012-1231-9 · 1.35 Impact Factor
  • Yunhui Shi · Bo Wen · Wenpeng Ding · Na Qi · Baocai Yin
    ABSTRACT: To render a realistic 3D mesh in geometry image-based 3D mesh compression, the normal-map image usually needs to be coded in addition to the geometry image. However, normal-map images are difficult to compress because they capture more details of the original mesh and have less spatial correlation between pixels than geometry images. This paper proposes a novel coding framework to solve this problem: we predict the normal-map image based on the correlation between the geometry image and the normal-map image, and we also utilize the strong correlation among the three components of the normal-map image to improve the prediction accuracy. In this framework, we only need to code the geometry image and the residual image generated from the normal-map image and its prediction. Experimental results show that, compared with coding the geometry image and normal-map image directly using JPEG2000, our coding framework not only improves the coding efficiency of geometry images and normal-map images, but also significantly enhances the realistic effect of the 3D mesh.
  • Yunhui Shi · He Li · Jin Wang · Wenpeng Ding · Baocai Yin
    ABSTRACT: This paper proposes a new method of inter prediction based on low-rank matrix completion. By collection and rearrangement, image regions with high correlations can be used to generate a low-rank or approximately low-rank matrix. We view prediction values as the missing part in an incomplete low-rank matrix, and obtain the prediction by recovering the generated low-rank matrix. Taking advantage of exact recovery of incomplete matrix, the low-rank based prediction can exploit temporal correlation better. Our proposed prediction has the advantage of higher accuracy and less extra information, as the motion vector doesn't need to be encoded. Simulation results show that the bit-rate saving of the proposed scheme can reach up to 9.91% compared with H.264/AVC. Our scheme also outperforms the counterpart of the Template Matching Averaging (TMA) prediction by 8.06% at most.
    2012 IEEE International Conference on Multimedia and Expo Workshops (ICMEW); 01/2012
  • ABSTRACT: Mode dependent directional transform (MDDT) can improve the coding efficiency of H.264/AVC but it also brings high computational complexity. In this paper we present a new design for implementing a fast MDDT through integer lifting steps. We first approximate the optimal MDDT by a proper transform matrix that can be implemented with butterfly-style operations. We further factorize the butterfly-style transform into a series of integer lifting steps to eliminate the need for multiplications. Experimental results show that the proposed fast MDDT can significantly reduce the computational complexity while introducing negligible loss in the coding efficiency. Due to the merit of integer lifting steps, the proposed fast MDDT is reversible and can be implemented in hardware very easily.
    Journal of Visual Communication and Image Representation 11/2011; 22(8):721-726. DOI:10.1016/j.jvcir.2011.01.007 · 1.36 Impact Factor
  • ABSTRACT: In this paper, we give an analytical model of the compression error of down-sampled compression based on the wavelet transform, which explains why down-sampling before compression can improve coding performance. We approximate the missing details due to down-sampling and compression by using a linear combination of a set of basis vectors with the L1 norm. We then propose a down-sampled and high frequency information approximated coding scheme and apply it to natural images, achieving gains in both subjective and objective quality compared with JPEG2000.
    Journal of Computational and Applied Mathematics 10/2011; 236:675-683. DOI:10.1016/j.cam.2011.06.025 · 1.08 Impact Factor
  • Zhen Zhang · Yunhui Shi · Dehui Kong · Wenpeng Ding · Baocai Yin
    ABSTRACT: Transform-based image codec follows the basic principle: the reconstructed quality is decided by the quantization level. Compressive sensing (CS) breaks the limit and states that sparse signals can be perfectly recovered from incomplete or even corrupted information by solving convex optimization. Under the same acquisition of images, if images are represented sparsely enough, they can be reconstructed more accurately by CS recovery than inverse transform. So, in this paper, we utilize a modified TV operator to enhance image sparse representation and reconstruction accuracy, and we acquire image information from transform coefficients corrupted by quantization noise. We can reconstruct the images by CS recovery instead of inverse transform. A CS-based JPEG decoding scheme is obtained and experimental results demonstrate that the proposed methods significantly improve the PSNR and visual quality of reconstructed images compared with original JPEG decoder.
    Journal of Computational and Applied Mathematics 10/2011; 236:812-818. DOI:10.1016/j.cam.2011.05.023 · 1.08 Impact Factor
  • Jingyan Shang · Wenpeng Ding · Yunhui Shi · Yanfeng Sun
    ABSTRACT: The latest video coding standard H.264/AVC outperforms previous standards in terms of coding efficiency at the cost of higher runtime complexity. When RDO is used, the most time-consuming process in an H.264/AVC encoder is mode decision, where all the intra/inter modes are tested to find the optimal coding mode. In this paper, we present a fast intra mode decision scheme, which first detects the texture direction and only tests a subset of intra modes consistent with the detected direction. Experimental results demonstrate that the proposed scheme significantly reduces the overall encoding time with negligible coding performance loss.
    Proceedings of the 2011 Third International Workshop on Education Technology and Computer Science - Volume 01; 03/2011
  • Zhen Zhang · Yunhui Shi · Baocai Yin
    ABSTRACT: Existing image codec technologies are based on transforms, which make the image signal compressible, while quantization is used to control bit rates. Compressive sensing (CS), a novel signal processing and recovery method, can be applied to image decoding to replace inverse transform reconstruction. This paper proposes an error estimation method based on an equalization quantization noise model for image codecs. Owing to the robustness of CS, it can improve the quality of reconstruction when the error has been estimated accurately. With a designed equalization matrix, a new norm constraint that significantly enhances the quality of CS recovery is presented. A CS-based JPEG decoding scheme based on quantization error estimation is also presented, and experimental results show gains over both CS reconstruction without error estimation and the original JPEG decoder.