ABSTRACT: The first version of the latest video coding standard, High Efficiency Video Coding (HEVC), only supports coding of video in the YUV 4:2:0 chroma format. An extension of the standard that supports other chroma formats is currently under development; however, version 1 decoders will not be able to handle bitstreams created using this extension. In this paper, we propose a novel method to create scalable bitstreams with a backward compatible base layer in 4:2:0 format that can be handled by HEVC version 1 decoders, plus additional layers that enhance the chroma resolution. The proposal codes 4:2:0 video in the base layer and the high resolution chroma components as auxiliary pictures in separate enhancement layers. The high resolution chroma components can optionally be predicted from the upsampled 4:2:0 chroma components of the base layer. Simulations show that the proposed method achieves scalability with a 9.5% coding efficiency penalty on average compared to single layer coding of 4:4:4 video. Compared to simulcast of 4:2:0 and 4:4:4 video, the proposed method provides a 38% gain on average. The backward compatibility with existing HEVC implementations, combined with high coding efficiency, makes services using high chroma fidelity easier to deploy.
International Conference on Image Processing (ICIP); 10/2014
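The layered chroma scheme above can be illustrated with a minimal sketch: the base-layer chroma plane is upsampled to full resolution to serve as an inter-layer prediction, and the enhancement layer carries the residual. The sample-repeat upsampling and the function names are illustrative assumptions; the abstract does not specify the upsampling filter.

```python
def upsample_chroma_2x(plane):
    """Nearest-neighbor 2x upsample of a 4:2:0 chroma plane toward 4:4:4.
    (Sample repetition is a stand-in for whatever filter the codec uses.)"""
    out = []
    for row in plane:
        wide = [s for s in row for _ in range(2)]  # repeat horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat vertically
    return out

def chroma_residual(full_res, predicted):
    """Enhancement-layer signal: difference between the full-resolution
    chroma and its prediction from the upsampled base layer."""
    return [[a - b for a, b in zip(ra, rb)]
            for ra, rb in zip(full_res, predicted)]
```

A version 1 decoder simply ignores the auxiliary-picture layers and reconstructs the 4:2:0 base layer.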
ABSTRACT: Color gamut scalability refers to coding a video in a layered manner where the base and enhancement layers are coded in different color gamut spaces. Color gamut scalability and its relationship with spatial scalability are currently being studied for the scalable extension of HEVC (SHVC) to enable coding of ultra-high definition content having the BT.2020 color gamut with 10-bit precision as an enhancement layer and high definition content having the BT.709 color gamut with 8-bit precision as the base layer. In this paper, we propose to use the weighted prediction tool of the SHVC standard to map the color gamut of the base layer to that of the enhancement layer. In addition, we propose a high-precision bit-depth mapping of the base layer to the enhancement layer that performs upsampling jointly with a bit-depth increase. Simulation results show that these two schemes improve the coding efficiency of the All Intra and Random Access configurations by about 6.8% and 3.6% on average, respectively, compared to a basic scheme where the bit-depth of the base layer is increased by simple bit-shifting. These gains are achieved without any changes to the SHVC standard, which makes the proposed method attractive for practical use cases as well.
ICASSP 2014 - 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 05/2014
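A minimal sketch of the two mappings compared above, assuming SHVC-style weighted prediction with a gain, an offset, and a rounding shift; the weight and offset values are illustrative, not ones from the paper.

```python
def bit_shift_mapping(sample_8bit):
    """Basic scheme: map an 8-bit base-layer sample to 10 bits by shifting."""
    return sample_8bit << 2

def weighted_mapping(sample_8bit, weight, offset, log2_denom=6):
    """Weighted-prediction style mapping: ((w * x + round) >> logWD) + o,
    clipped to the 10-bit range. weight/offset are illustrative values."""
    rounding = 1 << (log2_denom - 1)
    mapped = ((weight * sample_8bit + rounding) >> log2_denom) + offset
    return max(0, min(1023, mapped))
```

With `weight = 256` and `log2_denom = 6` the weighted mapping reduces to the plain shift by 2; the gamut mapping comes from choosing per-channel weights and offsets that differ from that identity.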
ABSTRACT: Coding efficiency gains in the new High Efficiency Video Coding (H.265/HEVC) video coding standard are achieved by improving many aspects of the traditional hybrid coding framework. Motion compensated prediction, and in particular the interpolation filter, is one area that was improved significantly over H.264/AVC. This paper presents the details of the interpolation filter design of the H.265/HEVC standard. First, the improvements of H.265/HEVC interpolation filtering over H.264/AVC are presented. These improvements include novel filter coefficient design with an increased number of taps and utilizing higher precision operations in interpolation filter computations. Then, the computational complexity is analyzed, both from theoretical and practical perspectives. Theoretical complexity analysis is done by studying the worst-case complexity analytically, whereas practical analysis is done by profiling an optimized decoder implementation. Coding efficiency improvements over the H.264/AVC interpolation filter are studied and experimental results are presented. They show a 4.0% average bitrate reduction for the luma component and 11.3% average bitrate reduction for the chroma components. The coding efficiency gains are significant for some video sequences and can reach up to 21.7%.
IEEE Journal of Selected Topics in Signal Processing 12/2013; 7(6):946-956.
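As a concrete illustration of the longer-tap, higher-precision filtering discussed above, the sketch below applies the well-known H.265/HEVC 8-tap luma filter for the half-sample position; the helper function and its padding convention are assumptions of this example.

```python
# HEVC 8-tap luma filter for the half-sample position (coefficients sum to 64).
HALF_PEL_TAPS = [-1, 4, -11, 40, 40, -11, 4, -1]

def interpolate_half_pel(samples, pos):
    """Horizontal half-pel value between samples[pos] and samples[pos + 1].
    Requires 3 samples of left context and 4 of right context around pos
    (a real codec pads the picture borders to guarantee this)."""
    acc = sum(c * samples[pos - 3 + i] for i, c in enumerate(HALF_PEL_TAPS))
    return (acc + 32) >> 6  # round and normalize by the coefficient sum
```

On a linear ramp the filter lands on the exact midpoint, which is the behavior the longer taps preserve while better attenuating aliasing on less benign signals.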
ABSTRACT: Dynamically changing the spatial resolution in a video conferencing session is useful for seamlessly adapting the bitrate to changing network conditions and for improving the user experience. Similar to earlier standards, the emerging High Efficiency Video Coding (H.265/HEVC) standard does not allow prediction across different resolutions, so an Instantaneous Decoding Refresh (IDR) picture must be sent to reinitialize the stream when a resolution change happens. IDR pictures take significantly more bits compared to predictively coded pictures. Thus, using them for resolution switching significantly reduces coding efficiency and increases the delay. In this paper we propose a method to support efficient adaptive resolution change using the emerging scalable H.265/HEVC standard. The proposed approach utilizes the inter-layer predicted random access pictures at the enhancement layer for resolution switching, instead of IDR pictures. The experimental results show that when the proposed method was used, the bitrate was reduced at the switching point by 34% on average for the tested video sequences. In addition, visual examples are shown demonstrating the improved visual quality with the proposed method.
2013 20th IEEE International Conference on Image Processing (ICIP); 09/2013
ABSTRACT: Segmentation of an object from a video is a challenging task in multimedia applications. Depending on the application, automatic or interactive methods are desired; regardless of the application type, however, efficient computation of video object segmentation is crucial for time-critical applications, and mobile and interactive applications in particular require near real-time efficiency. In this paper, we address the problem of video segmentation from the perspective of efficiency. We first redefine the problem of video object segmentation as the propagation of MRF energies along the temporal domain. For this purpose, a novel and efficient method is proposed to propagate MRF energies across frames via bilateral filters without using any global texture, color, or shape model. The recently presented bi-exponential filter is utilized for efficiency, and a novel technique is developed to dynamically solve graph cuts for varying, non-lattice graphs in a general linear filtering scenario. These improvements are evaluated in both automatic and interactive video segmentation scenarios. In addition to efficiency, segmentation quality is tested both quantitatively and qualitatively. For some challenging examples, significant time savings are observed without loss of segmentation quality.
IEEE Transactions on Multimedia 01/2013.
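The bi-exponential filter mentioned above can be sketched in one dimension as a causal and an anticausal first-order recursive pass; this is a generic illustration of that filter family, not the paper's exact edge-preserving formulation (which operates in the bilateral, cross-frame setting).

```python
def bi_exponential_filter(signal, alpha=0.5):
    """1-D bi-exponential smoothing: a causal and an anticausal first-order
    exponential pass over the signal, averaged. alpha in (0, 1] controls the
    smoothing strength (1.0 leaves the signal unchanged)."""
    n = len(signal)
    causal = [0.0] * n
    anticausal = [0.0] * n
    acc = signal[0]
    for i in range(n):                    # left-to-right pass
        acc = alpha * signal[i] + (1 - alpha) * acc
        causal[i] = acc
    acc = signal[-1]
    for i in range(n - 1, -1, -1):        # right-to-left pass
        acc = alpha * signal[i] + (1 - alpha) * acc
        anticausal[i] = acc
    return [(c + a) / 2 for c, a in zip(causal, anticausal)]
```

The appeal for efficiency is that both passes are O(n) with a constant cost per sample, independent of the effective smoothing radius.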
ABSTRACT: Coding efficiency gains in the High Efficiency Video Coding (H.265/HEVC) standard are achieved by improving many aspects of the traditional hybrid coding framework. Motion compensated prediction, and in particular the interpolation filter, is one of the areas that were improved significantly over H.264/AVC. This paper presents the details of the motion compensation interpolation filter design of the H.265/HEVC standard and its improvements over the interpolation filter design of H.264/AVC. These improvements include discrete cosine transform based filter coefficient design, utilizing longer filter taps for luma and chroma interpolation, and using higher precision operations in the intermediate computations. The computational complexity of the HEVC interpolation filter is also analyzed from both theoretical and practical perspectives. Experimental results show that a 4.5% average bitrate reduction for the luma component and a 13.0% average bitrate reduction for the chroma components are achieved compared to the interpolation filter of H.264/AVC. The coding efficiency gains are significant for some video sequences and can reach up to 21.7%.
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on; 01/2013
ABSTRACT: The High Efficiency Video Coding (HEVC) standard introduced an increased number of intra prediction directions in order to improve intra prediction performance by efficiently modeling the directional structures found in typical video content. Efficient coding of intra prediction mode information is realized through a Most Probable Mode (MPM) list approach. In a scalable system, due to the high correlation between the layers, utilizing the base layer intra prediction mode can improve coding performance. In this paper, we propose a new intra prediction mode coding algorithm for the scalable extension of HEVC where only the difference between the intra prediction modes of the base and enhancement layers is coded. We provide experimental results and a comparison of the proposed algorithm with an MPM list based approach where the base layer intra prediction mode is added to the list as the most probable mode. Experimental results show BD-rate gains of up to 1.1% in 2x spatial scalability and 0.7% in 1.5x scalability for the all intra configuration.
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on; 01/2013
ABSTRACT: The joint development of the upcoming High Efficiency Video Coding (HEVC) standard by the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group marks a new step in video compression capability. In technical terms, HEVC is a hybrid video coding approach using quadtree-based block partitioning together with motion-compensated prediction. Even though a high degree of adaptability is achieved by quadtree-based block partitioning, this approach has certain intrinsic drawbacks, which may result in redundant sets of motion parameters being transmitted. Previous work has shown that those redundancies can be effectively removed by merging the leaves of a particular quadtree structure. Following this concept, a block merging algorithm for HEVC is now proposed. This algorithm generates a single motion parameter set for a whole region of contiguous motion-compensated blocks. In this paper, we describe the various components of the proposed block merging algorithm and, using experimental evidence, demonstrate their benefits in terms of coding efficiency.
IEEE Transactions on Circuits and Systems for Video Technology 12/2012; 22(12):1720-1731.
ABSTRACT: This paper provides an overview of the intra coding techniques in the High Efficiency Video Coding (HEVC) standard being developed by the Joint Collaborative Team on Video Coding (JCT-VC). The intra coding framework of HEVC follows that of traditional hybrid codecs and is built on spatial sample prediction followed by transform coding and postprocessing steps. Novel features contributing to the increased compression efficiency include a quadtree-based variable block size coding structure, block-size agnostic angular and planar prediction, adaptive pre- and postfiltering, and prediction direction-based transform coefficient scanning. This paper discusses the design principles applied during the development of the new intra coding methods and analyzes the compression performance of the individual tools. Computational complexity of the introduced intra prediction algorithms is analyzed both by deriving operational cycle counts and benchmarking an optimized implementation. Using objective metrics, the bitrate reduction provided by the HEVC intra coding over the H.264/advanced video coding reference is reported to be 22% on average and up to 36%. Significant subjective picture quality improvements are also reported when comparing the resulting pictures at fixed bitrate.
IEEE Transactions on Circuits and Systems for Video Technology 12/2012; 22(12).
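Of the tools listed above, block-size agnostic planar prediction is straightforward to sketch: each predicted sample averages a horizontal and a vertical linear interpolation of the reference samples, following the H.265/HEVC planar formula. The array convention here (top and left references each carrying one extra corner sample) is this example's assumption.

```python
def planar_predict(top, left, n):
    """HEVC planar prediction for an n x n block (n a power of two).
    top[0..n-1] are the above reference samples and top[n] the top-right
    sample; left[0..n-1] the left references and left[n] the bottom-left
    sample."""
    shift = n.bit_length()  # log2(n) + 1 for power-of-two n
    pred = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            hor = (n - 1 - x) * left[y] + (x + 1) * top[n]
            ver = (n - 1 - y) * top[x] + (y + 1) * left[n]
            pred[y][x] = (hor + ver + n) >> shift
    return pred
```

Because the weights depend only on the sample position relative to the block size, the same procedure applies unchanged at every supported block size, which is the "block-size agnostic" property.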
ABSTRACT: Efficient and accurate interactive image segmentation has significant importance in many multimedia applications. For mobile touchscreen-based applications, efficiency is even more crucial. Moreover, due to the small screens of mobile devices, error tolerance is also a crucial factor. In this paper, a method for interactive image segmentation, tailored for mobile touchscreen devices, is proposed. Coloring is presented as the interaction methodology, and an automatic stroke-error correction method to correct inaccurate user interaction is also proposed. For efficient computation of the solution, a novel dynamic and iterative graph-cut formulation is developed. The efficiency and error tolerance of the proposed method are tested on various sample images, and a subjective evaluation of interactive segmentation algorithms for mobile touchscreens is also performed. For challenging examples, the experiments demonstrate the superior performance of the proposed method.
Proceedings of the 2nd ACM international workshop on Interactive multimedia on mobile and portable devices; 11/2012
ABSTRACT: We propose a complete still-image-based 2D-to-3D mobile conversion system for touchscreen use. The system consists of interactive segmentation followed by 3D rendering. The interactive segmentation is conducted dynamically through color Gaussian mixture model updates and a dynamic-iterative graph cut. A coloring gesture is used to guide and entertain the user during the process. The output of the image segmentation is then fed to the 3D rendering stage of the system. For the rendering stage, two novel improvements are proposed to handle holes resulting from the depth-image-based rendering process. These improvements are also expected to enhance 3D perception. Both methods are subjectively tested and their results are presented.
Image Processing (ICIP), 2012 19th IEEE International Conference on; 10/2012
ABSTRACT: In this paper, a novel algorithm called spatially varying transform (SVT) is proposed to improve the coding efficiency of video coders. SVT enables video coders to vary the position of the transform block, unlike state-of-the-art video codecs where the position of the transform block is fixed. In addition to changing the position of the transform block, the size of the transform can also be varied within the SVT framework, to better localize the prediction error so that the underlying correlations are better exploited. It is shown in this paper that by varying the position and size of the transform block, the characteristics of the prediction error are better localized, and the coding efficiency is thus improved. The proposed algorithm is implemented and studied in the H.264/AVC framework. We show that the proposed algorithm achieves a 5.85% bitrate reduction compared to H.264/AVC on average over a wide range of test sequences. Gains become more significant at medium to high bitrates for most tested sequences, and the bitrate reduction may reach 13.50%, which makes the proposed algorithm very suitable for future video coding solutions focusing on high fidelity video applications. The gain in coding efficiency is achieved with similar decoding complexity, which makes the proposed algorithm easy to incorporate into video codecs. However, the encoding complexity of SVT can be relatively high because of the need to perform a number of rate distortion optimization (RDO) steps to select the best location parameter (LP), which indicates the position of the transform. In this paper, a novel low complexity algorithm is also proposed, operating at the macroblock and block levels, to reduce the encoding complexity of SVT. Experimental results show that the proposed low complexity algorithm can reduce the number of LPs to be tested in RDO by about 80% with only a marginal penalty in coding efficiency.
IEEE Transactions on Circuits and Systems for Video Technology 03/2011.
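The location-parameter selection described above can be caricatured as follows, using residual energy inside the candidate window as a cheap stand-in for the full rate-distortion cost; a real SVT encoder evaluates the actual RD cost of each candidate and also varies the transform size.

```python
def best_transform_location(residual, mb=16, tb=8):
    """Pick the (x, y) position of a tb x tb transform block inside an
    mb x mb residual that captures the most residual energy -- an
    illustrative proxy for the RDO search over location parameters."""
    best_lp, best_energy = (0, 0), -1
    for y in range(mb - tb + 1):
        for x in range(mb - tb + 1):
            e = sum(residual[y + j][x + i] ** 2
                    for j in range(tb) for i in range(tb))
            if e > best_energy:
                best_lp, best_energy = (x, y), e
    return best_lp
```

The exhaustive double loop is exactly the encoder-side cost the paper's low-complexity algorithm attacks: pruning candidate LPs before the expensive per-candidate evaluation.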
ABSTRACT: In this paper we introduce a novel concept for intra coding of pictures, especially suitable for representing smooth image segments. Traditional block based transform coding methods cause visually annoying blocking artifacts for image segments with gradually changing smooth content. The proposed solution overcomes this drawback by defining a fully continuous surface of sample values approximating the original image. The gradient of the surface is indicated by transmitting values for selected control points within the image segment, and the surface itself is obtained by interpolating sample values in between the control points. This approach is found to provide up to 30 percent bitrate reductions in the case of natural imagery, and it has also been adopted into the initial HEVC codec design by the JCT-VC.
ABSTRACT: This paper describes a low complexity video codec with high coding efficiency. It was proposed to the High Efficiency Video Coding (HEVC) standardization effort of the Moving Picture Experts Group and the Video Coding Experts Group, and has been partially adopted into the initial HEVC Test Model under Consideration design. The proposal utilizes a quadtree-based coding structure with support for macroblocks of size 64 × 64, 32 × 32, and 16 × 16 pixels. Entropy coding is performed using a low complexity variable length coding scheme with improved context adaptation compared to the context adaptive variable length coding design in H.264/AVC. The proposal's interpolation and deblocking filter designs improve coding efficiency, yet have low complexity. Finally, intra-picture coding methods have been improved to provide better subjective quality than H.264/AVC. The subjective quality of the proposed codec has been evaluated extensively within the HEVC project, with results indicating that visual quality similar to H.264/AVC High Profile anchors is achieved, as measured by mean opinion score, using significantly fewer bits. These coding efficiency improvements are achieved with lower complexity than the H.264/AVC Baseline Profile, making the proposal particularly suitable for high resolution, high quality applications in resource-constrained environments.
IEEE Transactions on Circuits and Systems for Video Technology 01/2011.
ABSTRACT: Spatially Varying Transform (SVT) is a technique introduced earlier to improve the coding efficiency of video coders. SVT allows the position of the transform block within the macroblock to vary in order to better localize the underlying residual signal. The coding gains of SVT come with increased encoding complexity due to the additional need in the encoder to search for the best Location Parameter (LP), which indicates the position of the transform. In this paper, a new technique called Prediction Signal Aided Spatially Varying Transform (PSASVT) is proposed that utilizes the gradient of the prediction signal to eliminate unlikely LPs. As the number of candidate LPs is reduced, fewer LPs are searched by the encoder, which reduces the encoding complexity. In addition, fewer overhead bits are needed to code the selected LP, so the coding efficiency can be improved. Experimental results show that the number of LPs to be tested in RDO is reduced on average by more than 20%. This reduction in encoding complexity is achieved with a slight increase in coding efficiency, as the number of candidate LPs is reduced. The increase in decoding complexity is negligible.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, ICME 2011, 11-15 July, 2011, Barcelona, Catalonia, Spain; 01/2011
ABSTRACT: In this paper, we propose a novel algorithm named Spatially Varying Transform (SVT). The basic idea of SVT is that transform coding is not restricted to the normal block boundary but is instead adjusted to the characteristics of the prediction error. With this flexibility, we are able to improve coding efficiency by selecting and coding the best portion of the prediction error in terms of the rate-distortion tradeoff. The proposed algorithm is implemented and studied in the H.264/AVC framework. We show that the proposed algorithm achieves a 2.64% bitrate reduction compared to H.264/AVC on average over a wide range of test sequences. Gains become more significant at high bitrates, and the bitrate reduction can reach up to 10.22%, which makes the proposed algorithm very suitable for future video coding solutions focusing on high fidelity applications. The decoding complexity is expected to decrease because only a portion of the prediction error needs to be decoded.
IEEE Transactions on Circuits and Systems for Video Technology 01/2011; 21:127-140.
ABSTRACT: In this paper, an intra coding method is proposed aiming to efficiently code the smooth regions in depth map images. One of the characteristics of typical depth map images is that they contain large, gradually changing smooth areas. Traditional block based transform coding algorithms cause blocking artifacts in this type of region. Instead of using transform coding, the proposed method utilizes piecewise linear planar representations to code the smooth areas in the depth map picture. The piecewise linear representation for a block is obtained using the already decoded pixels neighboring the block and the block's bottom-right sample. The experimental results show that significant compression improvement can be achieved for depth sequences containing smooth regions. Visual inspection also shows that the blocking artifacts are significantly reduced.
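A hedged sketch of the piecewise linear representation described above: the plane is pinned by corners taken from the reconstructed neighbors plus the transmitted bottom-right sample, then bilinearly interpolated. The exact corner derivation is not given in the abstract, so the one below is a guess (and assumes n >= 2).

```python
def planar_depth_block(top_row, left_col, bottom_right, n):
    """Bilinear plane for an n x n depth block. top_row and left_col are
    reconstructed neighboring samples; bottom_right is the one transmitted
    sample. Corner choice (first/last neighbor samples) is an assumption."""
    tl, tr, bl, br = top_row[0], top_row[-1], left_col[-1], bottom_right
    block = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            wx, wy = x / (n - 1), y / (n - 1)
            top = tl + (tr - tl) * wx          # interpolate along the top edge
            bot = bl + (br - bl) * wx          # interpolate along the bottom edge
            block[y][x] = round(top + (bot - top) * wy)
    return block
```

Because only one sample per block is transmitted for such regions, smooth gradients are reproduced without the quantized-transform ringing that causes blocking artifacts.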
ABSTRACT: Stereoscopic 3D video is becoming a reality in many application areas, ranging from high quality entertainment to mobile video services. Due to the need to process two views, the complexity of 3D video applications is significantly higher than that of traditional 2D counterparts. In order to enable real-time 3D video services on mobile devices, this paper proposes a novel algorithm that reduces the complexity of stereo video encoding while improving coding efficiency. A novel search window center prediction method is proposed that exploits the correlation between the two views. Experimental results show that the average encoding time of the second view can be decreased by 80% with an increase in coding efficiency of up to 2%. State-of-the-art fast motion estimation methods for stereoscopic 3D video encoding decrease coding efficiency, whereas the proposed method achieves the speed-up with an increase in coding efficiency, making it suitable for high quality 3D video applications.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, ICME 2011, 11-15 July, 2011, Barcelona, Catalonia, Spain; 01/2011
ABSTRACT: New video coding solutions, such as the HEVC (High Efficiency Video Coding) standard being developed by the JCT-VC (Joint Collaborative Team on Video Coding), are typically designed for high resolution video content. Increasing video resolution creates two basic requirements for practical video codecs: they need to provide compression efficiency superior to prior video coding solutions, and their computational requirements need to be aligned with foreseeable hardware platforms. This paper proposes an intra prediction method which is designed to provide high compression efficiency and which can be implemented efficiently in resource-constrained environments, making it applicable to a wide range of use cases. When designing the method, special attention was given to the algorithmic definition of the prediction sample generation, in order to be able to utilize the same reconstruction process at different block sizes. The proposed method outperforms earlier variations of the same family of technologies significantly and consistently across different classes of video material, and has recently been adopted as the directional intra prediction method for the draft HEVC standard. Experimental results show that the proposed method outperforms the H.264/AVC intra prediction approach on average by 4.8%. For sequences with dominant directional structures, the coding efficiency gains become more significant and exceed 10%.
IEEE 13th International Workshop on Multimedia Signal Processing (MMSP 2011), Hangzhou, China, October 17-19, 2011; 01/2011
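The directional prediction described above was adopted into the draft HEVC standard; as an illustration, the sketch below follows the published H.265/HEVC angular interpolation formula for vertical modes with a non-negative intraPredAngle. The reference-array layout is this example's convention.

```python
def angular_predict_vertical(ref, n, angle):
    """HEVC-style angular prediction for an n x n block, vertical modes with
    intraPredAngle >= 0. ref[0] is the above-left sample and ref[1..] the
    above row extended toward the top-right (the caller must supply enough
    extended samples for the chosen angle)."""
    pred = [[0] * n for _ in range(n)]
    for y in range(n):
        pos = (y + 1) * angle          # 1/32-sample displacement for this row
        idx, fact = pos >> 5, pos & 31
        for x in range(n):
            # two-tap linear interpolation between neighboring references
            pred[y][x] = ((32 - fact) * ref[x + idx + 1]
                          + fact * ref[x + idx + 2] + 16) >> 5
    return pred
```

Because rows differ only in the displacement `pos`, the same sample generation rule scales to any block size, matching the block-size agnostic design goal stated in the abstract.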