Pao-Chi Chang

National Central University, Taoyuan City, Taiwan

Publications (82) · 34.07 Total impact

  • Shih-Wei Sun · Chien-Hao Kuo · Pao-Chi Chang
    ABSTRACT: This paper proposes a pairwise trajectory matching scheme from multiple cameras for people tracking, handling the mistracking caused by occlusion events occurring in one of the cameras. In a multiple-camera environment, a geometric calibration process for the common plane of the overlapping fields of view of the different cameras is necessary as the initial step. Once the geometry is calibrated, a homography transformation is applied, according to the 2D positions of the analyzed foot joints from the depth cameras, to project the detected foot points from different views into a synergistic virtual bird’s eye view for people tracking. In the virtual bird’s eye view, the Kalman-filter-based people tracking results from each of the cameras are fused according to the proposed pairwise trajectory matching scheme. The contribution of this paper is threefold: (1) The proposed hand-gesture-triggered calibration process with temporal synchronization capability can effectively build and calibrate the geometry in a region of interest. (2) The proposed interleaving-based skeleton acquisition and moving-average-based valid skeleton determination can extend the skeleton tracking capability to track more people. (3) The proposed pairwise trajectory matching scheme effectively manages occlusion situations that happen in one of the depth cameras. In addition, in extensive experiments, the proposed method can track up to six people moving freely and simultaneously in the field of view, with affordable complexity for real-time applications. Furthermore, the infrared-based depth cameras track people satisfactorily from bright to extremely dark environments.
    Article · Dec 2015 · Journal of Visual Communication and Image Representation
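The projection step in the abstract above, mapping a detected foot point into the bird's eye view with a homography, can be sketched as follows. This is only an illustration: the function name `project_point` and the example matrices are assumptions, not taken from the paper, and the calibration that would produce a real homography is not shown.

```python
def project_point(H, x, y):
    """Apply a 3x3 homography H (nested lists) to an image point (x, y).

    Hypothetical helper for illustration; H would come from calibration.
    """
    # Multiply H by the homogeneous point (x, y, 1).
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the third coordinate to return to 2D.
    return (u / w, v / w)


if __name__ == "__main__":
    # Made-up homography, used only to exercise the function.
    H_example = [[2, 0, 1], [0, 2, 0], [0, 0, 2]]
    print(project_point(H_example, 1, 1))
```

Each camera would use its own homography, so that foot points from all views land in one shared ground-plane coordinate system before the trajectories are matched.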
  • ABSTRACT: Rate control, which regulates the bitrate of video coding, is critical to time-sensitive video applications used over networks. However, the H.264/AVC standard does not respond to scene changes, and this causes the transmission quality to deteriorate when a scene change occurs. In this work, a scene change is detected by comparing the ratio of the sum of absolute differences (SAD) between two consecutive frames. Once a scene change is detected, the proposed method, which is modified from the reference software of H.264/AVC, re-assigns a quantization parameter (QP) value to regulate the bitrate. Because inter prediction works poorly for the scene-changed frame, the proposed method estimates its frame complexity based on the content, and further creates another Q-R model to assign the QP. The adaptive rate control mechanism presented in this study can quickly respond to the heavy bitrate increase caused by a change of scene. Simulation results show that the proposed method improves the average peak signal-to-noise ratio (PSNR) by approximately 1.1 dB, with a smaller buffer size, compared with the performance of the reference software JM version 17.2.
    Article · Dec 2014 · IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences
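One way to read the SAD-ratio test described in the abstract above is sketched below. The abstract does not give the exact rule, so the comparison used here (current SAD against the previous SAD) and the `ratio_threshold` value are assumptions made for illustration, not the paper's method.

```python
def sad(frame_a, frame_b):
    """Sum of absolute differences between two equally sized grayscale frames,
    each given as a list of rows of pixel values."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(frame_a, frame_b)
        for a, b in zip(row_a, row_b)
    )


def is_scene_change(prev_sad, curr_sad, ratio_threshold=3.0):
    """Flag a scene change when the SAD jumps relative to the previous SAD.

    ratio_threshold is a hypothetical value, not taken from the paper.
    max(prev_sad, 1) avoids dividing by zero for static content.
    """
    return curr_sad > ratio_threshold * max(prev_sad, 1)


if __name__ == "__main__":
    f0 = [[10, 10], [10, 10]]
    f1 = [[11, 10], [10, 12]]      # small change from f0
    f2 = [[200, 200], [200, 200]]  # abrupt change from f1
    print(is_scene_change(sad(f0, f1), sad(f1, f2)))
```

In a rate controller, a positive result would be the point at which a new QP is assigned for the scene-changed frame.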
  • ABSTRACT: The new video coding standard, High Efficiency Video Coding (HEVC), adopts a quadtree structure to provide variable transform sizes in the transform coding process. The heuristic examination of transform unit (TU) modes substantially increases the computational complexity, compared to previous video coding standards. Thus, efficiently reducing the TU candidate modes is crucial. In the proposed similarity-check scheme, sub-TU blocks are categorized into a strongly similar case or a weakly similar case, and the early TU termination or early TU splitting procedure is performed. For the strongly similar case, a property called zero-block inheritance combined with a zero-block detection technique is applied to terminate the TU search process early. For the weakly similar case, the gradients of residuals representing the similarity of coefficients are used to skip the current TU mode or stop the TU splitting process. In particular, the computation time is further reduced because all the required information for the proposed mode decision criteria is derived before performing the transform coding. The experimental results revealed that the proposed algorithm can save approximately 64% of the TU encoding time on average in inter prediction, with a negligible rate-distortion loss. © 2014 SPIE and IS&T
    Article · Nov 2014 · Journal of Electronic Imaging
  • Ren-Jie Wang · Chih-Wei Huang · Pao-Chi Chang
    ABSTRACT: Downsampling video coding, whereby downsampled frames are encoded, provides improved perceptual quality in rate-constrained situations. This method shows considerable advantages over other approaches, particularly for widespread high-definition video formats. This paper provides a comprehensive analysis of downsampling video coding. The study proposes a spatially scalable rate-distortion (RD) model, comprising quantization-distortion and quantization-rate models, and develops an optimal encoding frame size determination framework. The proposed method achieves a gain of up to 2.3 dB in peak signal-to-noise ratio (PSNR) at 1 Mb/s when compared with conventional full frame size coding. The RD performance is close to the optimal scenario, in which the ideal frame size is obtained by heuristically performing downsampling coding in various allowable sizes.
    Article · Nov 2014 · IEEE Transactions on Circuits and Systems for Video Technology
  • Ren-Jie Wang · Ya-Ting Yang · Pao-Chi Chang
    ABSTRACT: Efficient multimedia retrieval has become a vital issue because more audio and video data are now available. This paper focuses on content-based image retrieval (CBIR) in the compression domain (CPD). The retrieval features are extracted based on I-frame coding information in H.264. This paper proposes using a local mode histogram as the texture feature to match images and applying the residual coefficients to filter non-confident modes. The geometrical correspondence between two images is also considered. The experimental results show that the proposed method can substantially reduce computational and memory resource consumption, and provides similar performance compared with methods that extract features from decompressed images.
    Article · Jul 2014 · Journal of Visual Communication and Image Representation
  • ABSTRACT: H.264 scalable extension (H.264/SVC) is the current state-of-the-art standard for scalable video coding. Its interlayer prediction provides higher coding efficiency than previous standards. Since the standard was proposed, several attempts have been made to improve the performance based on its coding structure. Quantization-distortion (Q-D) modeling is a fundamental issue in video coding; therefore, this paper proposes new Q-D models for three interlayer predictions in H.264/SVC spatial scalability, that is, interlayer motion prediction, intraprediction, and residual prediction. An existing single-layer offline Q-D model is extended to H.264/SVC spatial scalable coding. In the proposed method, the residual power from the interlayer prediction is decomposed into the coding distortion and the prediction distortion. The prediction distortion is the mean square error (MSE) between two original signals that can be obtained by preprocessing with low complexity. Therefore, the coding distortion can be estimated based on both the quantization parameter (QP) and a precalculated prediction distortion before the encoding process. Consequently, the estimated quality based on the proposed models achieved a high accuracy of over 90% on average for the three interlayer predictions.
    Article · Jun 2014 · IEEE Transactions on Broadcasting
  • Article · Apr 2014
  • Conference Paper · Feb 2014
  • Ren-Jie Wang · Chih-Wei Huang · Pao-Chi Chang
    ABSTRACT: In this paper, we present an efficiency analysis of two sophisticated coding tools with high computational complexity, multi-reference frame (MRF) and variable block size (VBS), at various spatial resolutions. The relationship between coding efficiency and spatial resolution is derived theoretically. Based on the conclusion of the efficiency analysis, namely that the efficiency improvement from sophisticated coding tools gradually decreases at higher resolutions, we propose an adaptive coding configuration for encoding at various resolutions that yields a significant complexity reduction with a negligible decrease in RD performance.
    Conference Paper · Jun 2013
  • Yueh-Chuan Lu · Zong-Yi Chen · Pao-Chi Chang
    ABSTRACT: This paper proposes a low-power multi-Lane Mobile Industry Processor Interface (MIPI) Camera Serial Interface 2 (CSI-2) receiver architecture which adopts an 8-Byte parallel CSI protocol layer for hardware implementations. The proposed scheme can work in an environment with 4 data Lanes and 1 Gb/s per data Lane, i.e., with a maximum data rate of 4 Gb/s, at 62.5 MHz, which lengthens the time available for logic operations from 8 ns (125 MHz) to 16 ns (62.5 MHz) without throughput degradation. Therefore, the supply voltage (1.2 V) can be reduced, and the power consumption can also be reduced. The proposed architecture is implemented in 0.13 μm CMOS technology, and the total gate count is 32.7 K. It not only reduces the operating clock rate but also reduces the logic power consumption measured on chip by more than 37%~43%.
    Conference Paper · Jan 2013
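The throughput figures in the abstract above can be checked with a few lines of arithmetic. The 8-byte bus at 62.5 MHz comes from the abstract; the 4-byte bus width paired with the 125 MHz clock is an inference made here for comparison and is not stated in the source. The helper names are hypothetical.

```python
def throughput_gbps(bus_width_bytes, clock_mhz):
    """Data rate of a parallel bus in Gb/s: width in bits times clock in MHz / 1000."""
    return bus_width_bytes * 8 * clock_mhz / 1000.0


def clock_period_ns(clock_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / clock_mhz


if __name__ == "__main__":
    lanes, gbps_per_lane = 4, 1.0
    print("link rate:", lanes * gbps_per_lane, "Gb/s")
    # 8-byte protocol layer at 62.5 MHz (from the abstract).
    print(throughput_gbps(8, 62.5), "Gb/s,", clock_period_ns(62.5), "ns per cycle")
    # 4-byte layer at 125 MHz (assumed baseline, not stated in the source).
    print(throughput_gbps(4, 125.0), "Gb/s,", clock_period_ns(125.0), "ns per cycle")
```

Both configurations carry the same 4 Gb/s, but the wider bus doubles the time available per cycle for logic, which is what allows the supply voltage to be lowered.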
  • ABSTRACT: Rate control is critical to time-sensitive video applications over networks. However, the H.264/AVC standard makes no particular response to scene changes, which causes transmission quality to deteriorate significantly. In this work, we propose a robust rate control mechanism that can quickly respond to a scene change. The proposed mechanism first allocates the remaining frames as a transition GOP. Then, according to the buffer fullness, it estimates the target bits so that a QP (quantization parameter) value can be determined. Simulation results show that our proposed method improves the average PSNR (peak signal-to-noise ratio) by about 1.1 dB with a smaller buffer size, compared with the performance of JM version 17.2.
    Conference Paper · Jan 2013
  • Chien-Hao Kuo · Shih-Wei Sun · Pao-Chi Chang
    ABSTRACT: In this paper, we propose a pairwise curve matching scheme in a multi-camera environment to handle the mis-tracking issue caused by occlusion occurring in a single camera. Based on the foot points (joints) taken from the skeleton/joints of a human subject analyzed by a depth camera (e.g., Kinect) and used for people tracking in a field of view, we apply a homography transformation to project the foot points from different views to a virtual bird's eye view, and use a Kalman filter to achieve people tracking with pairwise curve matching. The contribution of this paper is threefold: (a) the proposed pairwise curve matching scheme can handle occlusion occurring in one of the cameras, (b) the complexity of the proposed scheme is low enough to be implemented in a real-time application, and (c) the implementation on a Kinect camera can provide satisfactory tracking results in a bright or extremely dark environment because the skeletons/joints are analyzed by the coded structured light-based infrared (IR) sensor.
    Conference Paper · Jan 2013
  • ABSTRACT: Video quality under a rate constraint is mainly controlled by the frame rate and the quantization parameter. This work proposes a mechanism to obtain the optimal frame rate that maximizes video quality under a rate constraint. Based on an objective metric of video quality that can reflect subjective quality, this work first proposes a video quality–frame rate–rate constraint model. Second, the relationship between model parameters and video characteristics is formulated. Finally, this work proposes an efficient frame rate optimization mechanism. Experimental results show that the optimal frame rate estimated by our mechanism is identical to the actual optimal frame rate under most bit rate constraints for both training sequences and new test sequences. In addition, the quality loss caused by the estimation error is generally limited to within 0.8 dB in our experiments.
    Article · Jun 2012 · IEEE Transactions on Broadcasting
  • ABSTRACT: Scalable Video Coding (SVC) provides efficient compression for video bitstreams equipped with various scalable configurations. H.264 scalable extension (H.264/SVC) is the most recent scalable coding standard. It involves state-of-the-art inter-layer prediction to provide higher coding efficiency than previous standards. Moreover, the requirements for video quality in distinct situations, such as link conditions or video contents, are usually different. Therefore, it is very desirable to be able to construct a model so that the target quality can be estimated in advance. This work proposes a Quantization-Distortion (Q-D) model for H.264/SVC spatial scalability, so that video quality can be estimated before the actual encoding is performed. In particular, we further decompose the residual from the inter-layer residual prediction into the previous distortion and Prior-Residual so that the residual can be estimated. In simulations based on the proposed model, we estimate the actual Q-D curves, and the average accuracy is 88.79%.
    Conference Paper · Nov 2011
  • Ming-Chen Chien · Pao-Chi Chang
    ABSTRACT: Real-Time Complexity Control for H.264 Video Encoding by Coding Gain Maximization
    Article · Jul 2011 · IEICE Transactions on Communications
  • Hua-Chang Chung · Zong-Yi Chen · Pao-Chi Chang
    ABSTRACT: This paper proposes a low-power deblocking filter (DF) architecture with a Horizontal Edge Skip Processing Architecture (HESPA) scheme that offers an intelligent edge-skip-aware mechanism for filtering the horizontal edges by adopting a four-stage pipeline and an adaptive hybrid filtering order to boost the speed of the DF process. The proposed architecture not only reduces the logic power consumption measured in FPGA by more than 34% but also reduces the filtering process to 100 clock cycles per macroblock (MB). The system throughput can easily support the 1080HD video format at 30 fps with a 70 MHz clock frequency for low-power and high-definition video applications. It is implemented on a 0.18 μm standardized cell library and consumes only 19.8K gates at a clock frequency of 200 MHz.
    Conference Paper · Feb 2011
  • Zong-Yi Chen · Jhe-Wei Syu · Pao-Chi Chang
    ABSTRACT: This paper proposes a fast inter-layer motion estimation algorithm for spatial scalability in the scalable video coding extension of H.264/AVC. In the enhancement layer motion estimation, we utilize the relation between two motion vector predictors, from the base layer and the enhancement layer respectively, to reduce the number of searches. Additionally, we utilize the mode correlations of temporal direction motion estimation to save more encoding time. The simulation results show that the proposed algorithm can reduce the computation time by up to 67.4% compared with JSVM9.12, with less than 0.0476 dB video quality degradation.
    Conference Paper · Aug 2010
  • Ren-Jie Wang · Ming-Chen Chien · Pao-Chi Chang
    ABSTRACT: Down-sampling coding, which sub-samples the image and encodes the smaller-sized images, is one of the solutions to raise the image quality at insufficiently high rates. In this work, we propose an Adaptive Down-Sampling (ADS) coding for H.264/AVC. The overall system distortion can be analyzed as the sum of the down-sampling distortion and the coding distortion. The down-sampling distortion is mainly the loss of the high-frequency components, which is highly dependent on the spatial difference. The coding distortion can be derived from classical rate-distortion theory. For a given rate and a video sequence, the optimum down-sampling resolution ratio can be derived by applying optimization theory to minimize the system distortion based on the models of the two distortions. This optimal resolution ratio is used in both the down-sampling and up-sampling processes in the ADS coding scheme. As a result, the rate-distortion performance of ADS coding is always higher than that of fixed-ratio coding or H.264/AVC by 2 to 4 dB at low to medium rates.
    Article · Feb 2010 · Proceedings of SPIE - The International Society for Optical Engineering
  • Zong-Yi Chen · Tien-Hsu Lee · Pao-Chi Chang
    ABSTRACT: Systematic Lossy Error Protection (SLEP) is a robust error-resilient mechanism which uses Wyner-Ziv coding to protect the video bitstream. In this paper, we propose a low-overhead adaptive lossy error protection (ALEP) mechanism that provides a good trade-off between error resilience and decoded video quality. The proposed method can generate appropriate redundant slices to provide proper error correction capability for varying channel conditions. In our simulation results, the proposed method maintains good video quality at low packet loss rates compared with the original SLEP and still provides sufficient error correction capability at high packet loss rates. It achieves a 2-3 dB PSNR improvement at a 5% packet loss rate for various video sequences in our simulations.
    Conference Paper · Oct 2009
  • Jong-Tzy Wang · Kai-Wen Liang · Shu-Fan Chang · Pao-Chi Chang
    ABSTRACT: In this paper, we propose an estimated high frequency compensated (EHFC) algorithm for super-resolution images. It is based on the iterative back projection (IBP) method combined with compensated high-frequency models chosen according to different applications. The proposed algorithm not only improves the quality of enlarged images produced by zero-order, bilinear, or bicubic interpolation methods, but also accelerates the convergence of IBP. In experiments with general test images, the EHFC method increases the speed by 1 to 6.5 times and achieves a 0.4 to 0.7 dB PSNR gain. In text image tests, the EHFC method increases the speed by 1.5 to 6.5 times and improves PSNR by 1.2 to 8.3 dB.
    Conference Paper · Oct 2009