Pao-Chi Chang

National Central University, Taoyuan City, Taiwan

Publications (80) · 31.56 Total impact

  • Ren-Jie Wang, Chih-Wei Huang, Pao-Chi Chang
    ABSTRACT: Downsampling video coding, whereby downsampled frames are encoded, provides improved perceptual quality in rate-constrained situations. This method shows considerable advantages over other approaches, particularly for the increasingly widespread high-definition video formats. This paper provides a comprehensive analysis of downsampling video coding. The study proposes a spatially scalable rate-distortion (RD) model, comprising quantization-distortion and quantization-rate models, and develops an optimal encoding frame size determination framework. The proposed method achieves a peak signal-to-noise ratio (PSNR) gain of up to 2.3 dB at 1 Mb/s compared with conventional full-frame-size coding. The RD performance is close to the optimal scenario, in which the ideal frame size is obtained by heuristically performing downsampling coding in various allowable sizes.
    IEEE Transactions on Circuits and Systems for Video Technology 11/2014; 24(11):1957-1968. DOI:10.1109/TCSVT.2014.2302519 · 2.26 Impact Factor
  • Journal of Electronic Imaging 11/2014; 23(6):061105. DOI:10.1117/1.JEI.23.6.061105 · 0.85 Impact Factor
  • Ren-Jie Wang, Ya-Ting Yang, Pao-Chi Chang
    ABSTRACT: Efficient multimedia retrieval has become a vital issue because more audio and video data are now available. This paper focuses on content-based image retrieval (CBIR) in the compression domain (CPD). The retrieval features are extracted based on I-frame coding information in H.264. This paper proposes using a local mode histogram as the texture feature to match images and applying the residual coefficients to filter non-confident modes. The geometrical correspondence between two images is also considered. The experimental results show that the proposed method can substantially reduce computational and memory resource consumption, and provides similar performance compared with methods that extract features from decompressed images.
    Journal of Visual Communication and Image Representation 07/2014; 25(5). DOI:10.1016/j.jvcir.2014.02.016 · 1.36 Impact Factor
  • ABSTRACT: H.264 scalable extension (H.264/SVC) is the current state-of-the-art standard for scalable video coding. Its interlayer prediction provides higher coding efficiency than previous standards. Since the standard was proposed, several attempts have been made to improve performance based on its coding structure. Quantization-distortion (Q-D) modeling is a fundamental issue in video coding; therefore, this paper proposes new Q-D models for three interlayer predictions in H.264/SVC spatial scalability, that is, interlayer motion prediction, intraprediction, and residual prediction. An existing single-layer offline Q-D model is extended to H.264/SVC spatial scalable coding. In the proposed method, the residual power from the interlayer prediction is decomposed into the coding distortion and the prediction distortion. The prediction distortion is the mean square error (MSE) between two original signals, which can be obtained by low-complexity preprocessing. Therefore, the coding distortion can be estimated from both the quantization parameter (QP) and a precalculated prediction distortion before the encoding process. Consequently, the quality estimated with the proposed models achieved a high accuracy of over 90% on average for the three interlayer predictions.
    IEEE Transactions on Broadcasting 06/2014; 60(2):413-419. DOI:10.1109/TBC.2014.2307486 · 2.65 Impact Factor
  • 04/2014; 5(2):81-93. DOI:10.5121/sipij.2014.5208
  • Fourth International conference on Computer Science & Information Technology; 02/2014
  • Ren-Jie Wang, Chih-Wei Huang, Pao-Chi Chang
    ABSTRACT: In this paper, we present an efficiency analysis of two sophisticated coding tools, multi-reference frame (MRF) and variable block size (VBS), which incur high computational complexity, at various spatial resolutions. The relationship between coding efficiency and spatial resolution is derived theoretically. Based on the conclusion of the efficiency analysis that the efficiency improvement from sophisticated coding tools gradually decreases at higher resolutions, we propose an adaptive coding configuration for encoding at various resolutions that yields a significant complexity reduction with a negligible decrease in RD performance.
    Consumer Electronics (ISCE), 2013 IEEE 17th International Symposium on; 01/2013
  • Yueh-Chuan Lu, Zong-Yi Chen, Pao-Chi Chang
    ABSTRACT: This paper proposes a low-power multi-Lane Mobile Industry Processor Interface (MIPI) Camera Serial Interface 2 (CSI-2) receiver architecture that adopts an 8-Byte parallel CSI protocol layer for hardware implementations. The proposed scheme can operate in an environment with 4 data Lanes at 1 Gb/s per data Lane, i.e., a maximum data rate of 4 Gb/s, at 62.5 MHz, which extends the time available for logic operations in each clock cycle from 8 ns (125 MHz) to 16 ns (62.5 MHz) without throughput degradation. Therefore, the supply voltage (1.2 V) can be reduced, and the power consumption can be reduced as well. The proposed architecture is implemented in 0.13 μm CMOS technology, and the total gate count is 32.7 K. It not only lowers the operating clock rate but also reduces logic power consumption by 37% to 43%, as measured on chip.
    Consumer Electronics (ISCE), 2013 IEEE 17th International Symposium on; 01/2013
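The clock-rate trade-off described in the abstract above can be checked with simple arithmetic. The following is a minimal sketch, not the authors' implementation: it assumes that the 125 MHz baseline corresponds to a 4-byte-wide datapath (the abstract does not state this width), and all function and variable names are illustrative.

```python
def datapath_throughput_bps(width_bytes, clock_hz):
    """Bits per second carried by a datapath that moves width_bytes every clock cycle."""
    return width_bytes * 8 * clock_hz


def clock_period_ns(clock_hz):
    """Clock period in nanoseconds, i.e. the time budget for logic in one cycle."""
    return 1e9 / clock_hz


# Figures taken from the abstract: 4 data Lanes at 1 Gb/s each.
LANES = 4
LANE_RATE_BPS = 1_000_000_000
link_rate_bps = LANES * LANE_RATE_BPS

# 8-byte parallel protocol layer at 62.5 MHz (from the abstract).
wide_bps = datapath_throughput_bps(8, 62_500_000)

# Assumed baseline: a 4-byte datapath at 125 MHz (width is an assumption).
narrow_bps = datapath_throughput_bps(4, 125_000_000)

print(link_rate_bps, wide_bps, narrow_bps)
print(clock_period_ns(125_000_000), clock_period_ns(62_500_000))
```

Under these assumptions both datapaths match the 4 Gb/s link rate, while the wider one doubles the per-cycle time budget from 8 ns to 16 ns.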
  • ABSTRACT: Rate control is critical to time-sensitive video applications over networks. However, the H.264/AVC standard has no specific response to scene changes, which significantly degrade transmission quality. In this work, we propose a robust rate control mechanism that can quickly respond to a scene change. The proposed mechanism first allocates the remaining frames as a transition GOP. Then, according to the buffer fullness, it estimates the target bits so that a quantization parameter (QP) value can be determined. Simulation results show that the proposed method improves the average peak signal-to-noise ratio (PSNR) by about 1.1 dB with a smaller buffer size, compared with JM version 17.2.
    Consumer Electronics (ISCE), 2013 IEEE 17th International Symposium on; 01/2013
  • Chien-Hao Kuo, Shih-Wei Sun, Pao-Chi Chang
    ABSTRACT: In this paper, we propose a pairwise curve matching scheme in a multi-camera environment to handle the mis-tracking caused by occlusion in a single camera. Based on the skeleton/joints of a human subject analyzed by a depth camera (e.g., Kinect), we use the foot points (joints) for people tracking in a field of view and apply a homography transformation to project the foot points from different views onto a virtual bird's-eye view, using a Kalman filter to achieve people tracking with pairwise curve matching. The contribution of this paper is threefold: (a) the proposed pairwise curve matching scheme can handle occlusion in one of the cameras, (b) the complexity of the proposed scheme is low enough for implementation in a real-time application, and (c) the implementation on a Kinect camera can provide satisfactory tracking results in a bright or extremely dark environment because the skeletons/joints are analyzed by the coded structured light-based infrared (IR) sensor.
    Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2013 Asia-Pacific; 01/2013
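The projection step mentioned in the abstract above (mapping a foot point from a camera view onto a common bird's-eye plane) can be sketched as follows. This is a generic planar homography applied in homogeneous coordinates, not the authors' code; the matrix values are hypothetical, and the Kalman filtering and curve matching stages are not shown.

```python
def project_point(H, x, y):
    """Map image point (x, y) through a 3x3 homography H given as nested lists."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to 2-D plane coordinates.
    return xh / w, yh / w


# Hypothetical homography for one camera; in practice it would be estimated
# from point correspondences between the camera view and the ground plane.
H_CAMERA = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.5, 1.0],
]

foot_point = (2.0, 2.0)  # illustrative foot-joint position in image coordinates
birds_eye = project_point(H_CAMERA, *foot_point)
print(birds_eye)
```

Each camera would use its own matrix so that foot points from different views land in the same ground-plane coordinate system before tracking.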
  • ABSTRACT: Video quality under a rate constraint is mainly controlled by the frame rate and the quantization parameter. This work proposes a mechanism to obtain the optimal frame rate that maximizes video quality under a rate constraint. Based on an objective metric of video quality that can reflect subjective quality, this work first proposes a model relating video quality, frame rate, and the rate constraint. Second, the relationship between model parameters and video characteristics is formulated. Finally, this work proposes an efficient frame rate optimization mechanism. Experimental results show that the optimal frame rate estimated by our mechanism is identical to the actual optimal frame rate under most bit rate constraints for both training sequences and new test sequences. In addition, the quality loss caused by the estimation error is generally limited to within 0.8 dB in our experiments.
    IEEE Transactions on Broadcasting 06/2012; 58(2):200-208. DOI:10.1109/TBC.2011.2182550 · 2.65 Impact Factor
  • Hua-Chang Chung, Zong-Yi Chen, Pao-Chi Chang
    ABSTRACT: This paper proposes a low-power deblocking filter (DF) architecture with a Horizontal Edge Skip Processing Architecture (HESPA) scheme that offers an intelligent edge-skip-aware mechanism for filtering the horizontal edges, adopting a four-stage pipeline and an adaptive hybrid filtering order to speed up the DF process. The proposed architecture not only reduces logic power consumption by more than 34%, as measured on FPGA, but also cuts the filtering process down to 100 clock cycles per macroblock (MB). The system throughput can easily support the 1080HD video format at 30 fps with a 70 MHz clock frequency for low-power, high-definition video applications. It is implemented with a 0.18 μm standard cell library and consumes only 19.8K gates at a clock frequency of 200 MHz.
    Consumer Electronics (ICCE), 2011 IEEE International Conference on; 02/2011
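The throughput claim in the abstract above can be sanity-checked with a cycle budget. This is a rough sketch under stated assumptions: 16x16 macroblocks, a 1920x1080 frame padded up to whole macroblocks, and the best-case figure of 100 cycles applied to every macroblock. The function name is illustrative, and the result is a lower bound rather than a figure reported by the authors.

```python
import math


def required_clock_hz(width, height, fps, cycles_per_mb, mb_size=16):
    """Minimum clock rate needed to filter every macroblock of every frame in real time."""
    mbs_per_frame = math.ceil(width / mb_size) * math.ceil(height / mb_size)
    return mbs_per_frame * fps * cycles_per_mb


# 1080HD at 30 fps, 100 clock cycles per macroblock (figures from the abstract).
needed_hz = required_clock_hz(1920, 1080, 30, 100)
print(needed_hz / 1e6, "MHz needed in the best case")
```

With these assumptions the best-case requirement is below the 70 MHz clock quoted in the abstract; the abstract does not break down how the remaining margin is used.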
  • ABSTRACT: Scalable Video Coding (SVC) provides efficient compression for video bitstreams equipped with various scalable configurations. H.264 scalable extension (H.264/SVC) is the most recent scalable coding standard. It uses state-of-the-art inter-layer prediction to provide higher coding efficiency than previous standards. Moreover, the requirements for video quality in distinct situations, such as link conditions or video contents, usually differ. Therefore, it is very desirable to construct a model so that the target quality can be estimated in advance. This work proposes a Quantization-Distortion (Q-D) model for H.264/SVC spatial scalability, so that video quality can be estimated before the actual encoding is performed. In particular, we further decompose the residual from the inter-layer residual prediction into the previous distortion and the Prior-Residual so that the residual can be estimated. In simulations based on the proposed model, we estimate the actual Q-D curves with an average accuracy of 88.79%.
    Advances in Image and Video Technology - 5th Pacific Rim Symposium, PSIVT 2011, Gwangju, South Korea, November 20-23, 2011, Proceedings, Part II; 01/2011
  • Ming-Chen Chien, Pao-Chi Chang
    ABSTRACT: Real-Time Complexity Control for H.264 Video Encoding by Coding Gain Maximization
    IEICE Transactions on Communications 01/2011; 94-B:2181-2184. DOI:10.1587/transcom.E94.B.2181 · 0.33 Impact Factor
  • Zong-Yi Chen, Jhe-Wei Syu, Pao-Chi Chang
    ABSTRACT: This paper proposes a fast inter-layer motion estimation algorithm for spatial scalability in the scalable video coding extension of H.264/AVC. In the enhancement layer motion estimation, we utilize the relation between two motion vector predictors, from the base layer and the enhancement layer respectively, to reduce the number of searches. Additionally, we utilize the mode correlations of temporal direction motion estimation to save more encoding time. The simulation results show that the proposed algorithm can reduce computation time by up to 67.4% compared with JSVM 9.12, with less than 0.0476 dB video quality degradation.
    Multimedia and Expo (ICME), 2010 IEEE International Conference on; 08/2010
  • Ren-Jie Wang, Ming-Chen Chien, Pao-Chi Chang
    ABSTRACT: Down-sampling coding, which sub-samples the image and encodes the smaller-sized images, is one of the solutions for raising image quality at insufficiently high rates. In this work, we propose an Adaptive Down-Sampling (ADS) coding for H.264/AVC. The overall system distortion can be analyzed as the sum of the down-sampling distortion and the coding distortion. The down-sampling distortion is mainly the loss of the high-frequency components, which is highly dependent on the spatial difference. The coding distortion can be derived from classical Rate-Distortion theory. For a given rate and video sequence, the optimum down-sampling resolution-ratio can be derived by minimizing the system distortion based on the models of the two distortions. This optimal resolution-ratio is used in both the down-sampling and up-sampling processes of the ADS coding scheme. As a result, the rate-distortion performance of ADS coding is always higher than that of fixed-ratio coding or H.264/AVC by 2 to 4 dB at low to medium rates.
    Proceedings of SPIE - The International Society for Optical Engineering 02/2010; DOI:10.1117/12.840257 · 0.20 Impact Factor
  • Zong-Yi Chen, Tien-Hsu Lee, Pao-Chi Chang
    ABSTRACT: Systematic Lossy Error Protection (SLEP) is a robust error-resilient mechanism that uses Wyner-Ziv coding to protect the video bitstream. In this paper, we propose a low-overhead adaptive lossy error protection (ALEP) mechanism that provides a good trade-off between error resilience and decoded video quality. The proposed method can generate appropriate redundant slices to provide proper error correction capability for varying channel conditions. In our simulation results, the proposed method maintains good video quality at low packet loss rates compared with the original SLEP and still provides sufficient error correction capability at high packet loss rates. It achieves a 2-3 dB PSNR improvement at a 5% packet loss rate for various video sequences.
    Communications and Information Technology, 2009. ISCIT 2009. 9th International Symposium on; 10/2009
  • ABSTRACT: In this paper, we propose an estimated high frequency compensated (EHFC) algorithm for super-resolution images. It is based on the iterative back projection (IBP) method combined with compensated high-frequency models chosen according to the application. The proposed algorithm not only improves the quality of enlarged images produced by zero-order, bilinear, or bicubic interpolation methods, but also accelerates the convergence of IBP. In experiments with general test images, the EHFC method increases the speed by 1 to 6.5 times and achieves a 0.4 to 0.7 dB PSNR gain. In text image tests, the EHFC method increases the speed by 1.5 to 6.5 times and improves PSNR by 1.2 to 8.3 dB.
    Communications and Information Technology, 2009. ISCIT 2009. 9th International Symposium on; 10/2009
  • Chu-Chuan Lee, Ya-Ju Yu, Pao-Chi Chang
    ABSTRACT: Video streaming applications have great potential in the IP dual stack network, which supports the IPv4 and IPv6 protocols simultaneously. However, the significance of video packets belonging to various video sequences differs. Applying equal error protection to all video packets in the IP network will degrade the video quality significantly. This paper proposes an Adaptive Significance Determination Mechanism in Temporal and Spatial domains (ASDM-TS) for H.264 videos over an IP dual stack network with the DiffServ model. ASDM-TS determines the video packet significance simultaneously in the temporal and spatial domains. In the temporal domain, ASDM-TS evaluates the packet significance based on the estimated error propagation if a packet is lost. In the spatial domain, ASDM-TS computes the packet significance based on the content complexity of a packet. Moreover, ASDM-TS adapts to various video sequences with a self-learning method. Compared with traditional schemes, simulation results show that the proposed scheme improves the accuracy of significance determination by up to 15% and improves the received video quality by up to 0.7 dB in PSNR.
    Communications and Networking in China, 2009. ChinaCOM 2009. Fourth International Conference on; 09/2009
  • ABSTRACT: Media encryption technologies serve as the first line of defense in securing access to multimedia data. Traditional cryptographic encryption can achieve provable security but is unfortunately sensitive to a single bit error, which causes an unreliable packet to be dropped and thus creates packet loss. To achieve robust media encryption, the requirement of error resilience can be met with error-resilient media transmission. This study proposes a video joint encryption and transmission (video JET) scheme that exploits media hash-embedded residual data to achieve motion estimation and compensation for recovering lost packets, while maintaining format compliance and cryptographic provable security. Interestingly, since the video block hash preserves the condensed content to facilitate the search for similar blocks, motion estimation is implicitly performed through robust media hash matching, which is the unique characteristic of our method. We analyze and compare the resilience to (bursty) packet loss of the proposed method and forward error correction (FEC), which has been extensively employed to protect video packets over error-prone networks. The feasibility of our packet-loss-resilient video JET approach is further demonstrated through experimental results.
    Multimedia Tools and Applications 09/2009; 44:249-278. DOI:10.1007/s11042-009-0295-7 · 1.06 Impact Factor