ABSTRACT: In this paper, a novel algorithm called spatially varying transform (SVT) is proposed to improve the coding efficiency of video coders. SVT enables video coders to vary the position of the transform block, unlike state-of-the-art video codecs where the position of the transform block is fixed. In addition to changing the position of the transform block, the size of the transform can also be varied within the SVT framework, to better localize the prediction error so that the underlying correlations are better exploited. It is shown in this paper that by varying the position of the transform block and its size, the characteristics of the prediction error are better localized, and the coding efficiency is thus improved. The proposed algorithm is implemented and studied in the H.264/AVC framework. We show that the proposed algorithm achieves a 5.85% bitrate reduction compared to H.264/AVC on average over a wide test set. Gains become more significant at medium to high bitrates for most tested sequences, and the bitrate reduction may reach 13.50%, which makes the proposed algorithm very suitable for future video coding solutions focusing on high-fidelity video applications. The gain in coding efficiency is achieved with a similar decoding complexity, which makes the proposed algorithm easy to incorporate into video codecs. However, the encoding complexity of SVT can be relatively high because of the need to perform a number of rate-distortion optimization (RDO) steps to select the best location parameter (LP), which indicates the position of the transform. In this paper, a novel low-complexity algorithm is also proposed, operating on the macroblock and block levels, to reduce the encoding complexity of SVT. Experimental results show that the proposed low-complexity algorithm can reduce the number of LPs to be tested in RDO by about 80% with only a marginal penalty in coding efficiency.
IEEE Transactions on Circuits and Systems for Video Technology 03/2011; · 1.82 Impact Factor
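The LP search at the heart of SVT can be pictured as a small rate-distortion loop: slide the transform block over every admissible position inside the macroblock and keep the position with the lowest cost. The sketch below is a hypothetical illustration only; the residual layout, the toy cost model (energy left uncoded plus a flat LP signalling rate), and the `lam` value are assumptions, not the paper's actual RDO.

```python
import itertools

def svt_select_lp(residual, tx_size=8, mb_size=16, lam=0.85):
    """Slide a tx_size x tx_size transform block over a mb_size x mb_size
    residual and return the location parameter (LP) with the lowest cost.
    Toy cost model: distortion = residual energy left outside the coded
    block, rate = a flat 8 bits to signal the LP."""
    total_energy = sum(v * v for row in residual for v in row)
    best_lp, best_cost = None, float("inf")
    positions = itertools.product(range(mb_size - tx_size + 1), repeat=2)
    for lp, (y, x) in enumerate(positions):
        coded = sum(residual[y + i][x + j] ** 2
                    for i in range(tx_size) for j in range(tx_size))
        cost = (total_energy - coded) + lam * 8
        if cost < best_cost:
            best_lp, best_cost = lp, cost
    return best_lp, best_cost
```

With a 16x16 macroblock and an 8x8 transform there are 81 candidate positions, which is exactly the search the low-complexity algorithm described above aims to prune.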
ABSTRACT: In this paper we introduce a novel concept for intra coding of pictures, especially suitable for representing smooth image segments. Traditional block-based transform coding methods cause visually annoying blocking artifacts for image segments with gradually changing smooth content. The proposed solution overcomes this drawback by defining a fully continuous surface of sample values approximating the original image. The gradient of the surface is indicated by transmitting values for selected control points within the image segment, and the surface itself is obtained by interpolating sample values in between the control points. This approach is found to provide up to 30 percent bitrate reductions in the case of natural imagery and has also been adopted into the initial HEVC codec design by JCT-VC.
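As a rough illustration of the idea, a smooth block can be reconstructed purely from a few transmitted control-point values by interpolating the surface in between. The sketch below assumes four corner control points and plain bilinear interpolation; the actual control-point placement and interpolation kernel in the proposal may differ.

```python
def interpolate_surface(c00, c01, c10, c11, h, w):
    """Reconstruct an h x w block as a continuous surface spanned by four
    corner control points (top-left, top-right, bottom-left, bottom-right)
    via bilinear interpolation between them."""
    block = []
    for i in range(h):
        a = i / (h - 1) if h > 1 else 0.0       # vertical weight
        row = []
        for j in range(w):
            b = j / (w - 1) if w > 1 else 0.0   # horizontal weight
            top = c00 * (1 - b) + c01 * b
            bottom = c10 * (1 - b) + c11 * b
            row.append(top * (1 - a) + bottom * a)
        block.append(row)
    return block
```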
ABSTRACT: Spatially Varying Transform (SVT) is a technique introduced earlier to improve the coding efficiency of video coders. SVT allows the position of the transform block within the macroblock to vary in order to better localize the underlying residual signal. The coding gains of SVT come with increased encoding complexity due to the additional need in the encoder to search for the best Location Parameter (LP), which indicates the position of the transform. In this paper, a new technique called Prediction Signal Aided Spatially Varying Transform (PSASVT) is proposed that utilizes the gradient of the prediction signal to eliminate unlikely LPs. As the number of candidate LPs is reduced, fewer LPs are searched by the encoder, which reduces the encoding complexity. In addition, fewer overhead bits are needed to code the selected LP, and thus the coding efficiency can be improved. Experimental results show that the number of LPs to be tested in RDO is reduced on average by more than 20%. This reduction in encoding complexity is achieved with a slight increase in coding efficiency, as the number of candidate LPs is reduced. The increase in decoding complexity is negligible.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, ICME 2011, 11-15 July, 2011, Barcelona, Catalonia, Spain; 01/2011
ABSTRACT: Stereoscopic 3D video is becoming a reality in many application areas, ranging from high-quality entertainment to mobile video services. Due to the need to process two views, the complexity of 3D video applications is significantly higher than that of traditional 2D counterparts. In order to enable real-time 3D video services in mobile devices, this paper proposes a novel algorithm which reduces the complexity of stereo video encoding while improving coding efficiency. A novel search window center prediction method is proposed that exploits the correlation between the two views. Experimental results show that the average encoding time of the second view can be decreased by 80% with an increase in coding efficiency of up to 2%. State-of-the-art fast motion estimation methods for stereoscopic 3D video encoding decrease coding efficiency, whereas the proposed method achieves the speed-up with an increase in coding efficiency, making it suitable for high-quality 3D video applications.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, ICME 2011, 11-15 July, 2011, Barcelona, Catalonia, Spain; 01/2011
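The abstract does not spell out the predictor itself, so the sketch below is only one plausible way to exploit inter-view correlation for search-center prediction: take the component-wise median of the co-located motion vector from the already-coded first view and the causal neighbour vectors. The function name and the median rule are assumptions for illustration.

```python
def predict_search_center(colocated_mv, neighbor_mvs):
    """Predict the motion-search window center for a block of the second
    view as the component-wise median of the co-located motion vector
    from the first (already coded) view and the causal neighbour vectors.
    Vectors are (x, y) tuples in sample units."""
    mvs = [colocated_mv] + list(neighbor_mvs)
    xs = sorted(mv[0] for mv in mvs)
    ys = sorted(mv[1] for mv in mvs)
    mid = len(mvs) // 2
    return xs[mid], ys[mid]
```

Centering the search window this way lets the encoder use a much smaller window for the second view, which is where the reported 80% encoding-time reduction would come from.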
ABSTRACT: New video coding solutions, such as the HEVC (High Efficiency Video Coding) standard being developed by JCT-VC (Joint Collaborative Team on Video Coding), are typically designed for high resolution video content. Increasing video resolution creates two basic requirements for practical video codecs: they need to provide compression efficiency superior to prior video coding solutions, and their computational requirements need to be aligned with foreseeable hardware platforms. This paper proposes an intra prediction method which is designed to provide high compression efficiency and which can be implemented effectively in resource-constrained environments, making it applicable to a wide range of use cases. When designing the method, special attention was given to the algorithmic definition of prediction sample generation, in order to be able to utilize the same reconstruction process at different block sizes. The proposed method outperforms earlier variations of the same family of technologies significantly and consistently across different classes of video material, and has recently been adopted as the directional intra prediction method for the draft HEVC standard. Experimental results show that the proposed method outperforms the H.264/AVC intra prediction approach on average by 4.8%. For sequences with dominant directional structures, the coding efficiency gains become more significant and exceed 10%.
IEEE 13th International Workshop on Multimedia Signal Processing (MMSP 2011), Hangzhou, China, October 17-19, 2011; 01/2011
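The block-size-independent sample-generation rule behind such a directional predictor can be sketched in a few lines: each predicted sample is projected along the prediction angle onto the row of reconstructed reference samples above the block and linearly interpolated with 1/32-sample accuracy, using the same arithmetic for every block size. The following simplified, vertical-family-only Python sketch (angle in 1/32-sample units of displacement per row) illustrates the principle; it is not the exact draft-HEVC specification text.

```python
def angular_predict(top_refs, angle, n):
    """Predict an n x n block from the reference sample row above it.
    For row y (1-based), each sample is fetched at horizontal displacement
    y*angle/32 and linearly interpolated, with rounding, between the two
    nearest reference samples."""
    pred = []
    for y in range(1, n + 1):
        disp = y * angle              # displacement in 1/32-sample units
        idx, frac = disp >> 5, disp & 31
        row = [(top_refs[x + idx] * (32 - frac)
                + top_refs[x + idx + 1] * frac + 16) >> 5
               for x in range(n)]
        pred.append(row)
    return pred
```

Because the projection and interpolation are defined per sample rather than per block size, the same routine serves 4x4 through 64x64 blocks, which is the implementation-friendliness the abstract emphasizes.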
ABSTRACT: This paper describes a low complexity video codec with high coding efficiency. It was proposed to the high efficiency video coding (HEVC) standardization effort of moving picture experts group and video coding experts group, and has been partially adopted into the initial HEVC test model under consideration design. The proposal utilizes a quadtree-based coding structure with support for macroblocks of size 64 × 64, 32 × 32, and 16 × 16 pixels. Entropy coding is performed using a low complexity variable length coding scheme with improved context adaptation compared to the context adaptive variable length coding design in H.264/AVC. The proposal's interpolation and deblocking filter designs improve coding efficiency, yet have low complexity. Finally, intra-picture coding methods have been improved to provide better subjective quality than H.264/AVC. The subjective quality of the proposed codec has been evaluated extensively within the HEVC project, with results indicating that similar visual quality to H.264/AVC High Profile anchors is achieved, measured by mean opinion score, using significantly fewer bits. Coding efficiency improvements are achieved with lower complexity than the H.264/AVC Baseline Profile, particularly suiting the proposal for high resolution, high quality applications in resource-constrained environments.
IEEE Transactions on Circuits and Systems for Video Technology 01/2011; · 1.82 Impact Factor
ABSTRACT: In this paper, an intra coding method is proposed aiming to efficiently code the smooth regions in depth map images. One of the characteristics of typical depth map images is that they contain large, gradually changing smooth areas. Traditional block-based transform coding algorithms cause blocking artifacts in this type of region. Instead of using transform coding, the proposed method utilizes piecewise linear planar representations to code the smooth areas in the depth map picture. The piecewise linear representation for a block is obtained using the already decoded pixels neighboring the block and the block's bottom-right sample. The experimental results show that significant compression improvement can be achieved with depth sequences containing smooth regions. By visual inspection, it is also shown that the blocking artifacts are significantly reduced.
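A minimal sketch of a planar depth block, assuming the plane is anchored at just three values: the reconstructed top-left and top-right neighbouring samples and the signalled bottom-right sample (the paper derives its representation from the full set of decoded neighbours, so this is an illustration, not the actual method).

```python
def planar_block(top_left, top_right, bottom_right, n):
    """Fill an n x n depth block with a linear plane anchored at the
    reconstructed top-left and top-right neighbouring samples and a
    signalled bottom-right sample.  The fourth corner follows from the
    parallelogram rule, after which the plane is evaluated bilinearly."""
    bottom_left = top_left + (bottom_right - top_right)
    block = []
    for i in range(n):
        a = i / (n - 1) if n > 1 else 0.0       # vertical weight
        left = top_left * (1 - a) + bottom_left * a
        right = top_right * (1 - a) + bottom_right * a
        row = []
        for j in range(n):
            b = j / (n - 1) if n > 1 else 0.0   # horizontal weight
            row.append(left * (1 - b) + right * b)
        block.append(row)
    return block
```

Because the block is an exact plane rather than a quantized transform reconstruction, no blocking artifacts arise inside smooth gradient regions.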
ABSTRACT: In this paper, we propose a novel algorithm, named Spatially Varying Transform (SVT). The basic idea of SVT is that we do not restrict the transform coding inside the normal block boundary but adjust it to the characteristics of the prediction error. With this flexibility, we are able to achieve coding efficiency improvement by selecting and coding the best portion of the prediction error in terms of the rate-distortion tradeoff. The proposed algorithm is implemented and studied in the H.264/AVC framework. We show that the proposed algorithm achieves a 2.64% bit-rate reduction compared to H.264/AVC on average over a wide test set. Gains become more significant at high bit-rates and the bit-rate reduction can be up to 10.22%, which makes the proposed algorithm very suitable for future video coding solutions focusing on high-fidelity applications. The decoding complexity is expected to decrease because only a portion of the prediction error needs to be decoded.
IEEE Trans. Circuits Syst. Video Techn. 01/2011; 21:127-140.
ABSTRACT: Several quality evaluation studies have been performed for video-plus-depth coding systems. In these studies, however, the distortions in the synthesized views have been quantified in experimental setups where both the texture and depth videos are compressed. Nevertheless, there are several factors that affect the quality of the synthesized view. Incorporating more than one source of distortion in the study could be misleading; one source of distortion could mask (or be masked by) the effect of other sources of distortion. In this paper, we conduct a quality evaluation study that aims to assess the distortions introduced by the view synthesis procedure and depth map compression in multiview-video-plus-depth coding systems. We report important findings that many of the existing studies have overlooked, yet are essential to the reliability of quality evaluation. In particular, we show that the view synthesis reference software yields high distortions that mask those due to depth map compression, when the distortion is measured by average luma peak signal-to-noise ratio. In addition, we show what quality metric to use in order to reliably quantify the effect of depth map compression on view synthesis quality. Experimental results that support these findings are provided for both synthetic and real multiview-video-plus-depth sequences.
3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video (3DTV-CON), 2010; 07/2010
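For reference, the metric at issue above, average luma PSNR, is a simple function of the mean squared error between reference and synthesized luma samples. A minimal sketch (flat sample lists; a real evaluation would operate on full frames and average over a sequence):

```python
import math

def luma_psnr(ref, test, peak=255.0):
    """Average luma PSNR between a reference view and a synthesized or
    decoded view: 10*log10(peak^2 / MSE).  Under this metric, a large
    fixed synthesis error can dominate the MSE and mask the smaller
    error added by depth-map compression."""
    mse = sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak * peak / mse)
```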
ABSTRACT: Directional Adaptive Interpolation Filtering (DAIF) is a novel interpolation technique that was proposed recently for hybrid video coding. It was reported that this technique outperforms the standard H.264/AVC interpolation in terms of coding gain while requiring a smaller number of arithmetic operations. In this publication we present an optimized implementation of DAIF on a modern computing platform exploiting Single Instruction Multiple Data (SIMD) parallelism. In addition, we provide a complexity analysis in which the computational complexity is estimated as the number of clock cycles per output sample. The proposed SIMD-based implementation of DAIF has lower or comparable interpolation complexity compared to the highly optimized SIMD-based implementation of the H.264/AVC interpolation. Considering the significantly better coding gain provided by DAIF, we believe this approach will play a significant role in future video coding standards.
Circuits and Systems (ISCAS), Proceedings of 2010 IEEE International Symposium on; 07/2010
ABSTRACT: In its Release 6, the Third Generation Partnership Project (3GPP) is defining a new service known as Multimedia Broadcast/Multicast Service (MBMS) that enables a number of new applications. Due to its nature, no feedback link from the receiver to the sender exists in MBMS. Hence, no retransmission techniques can be employed to cope with the underlying erroneous wireless channel. Instead, 3GPP is adopting a channel coding technique based on a Forward Error Correction (FEC) scheme at the application layer. In this work, we try to find a good balance of source and channel coding to achieve the best video quality under MBMS conditions. We use a simulation environment that closely represents the channel behaviour of the 3GPP wireless link and compare cases with different FEC overheads at different error rates. Experiments show that careful selection of the FEC overhead yields significantly better video quality.
ABSTRACT: This paper describes a low complexity video codec with high coding efficiency. It was proposed to the High Efficiency Video Coding (HEVC) standardization effort of MPEG and VCEG, and has been partially adopted into the initial HEVC Test Model under Consideration design. The proposal utilizes a quad-tree structure with support for large macroblocks of size 64×64 and 32×32, in addition to macroblocks of size 16×16. Entropy coding is done using a low complexity variable length coding based scheme with improved context adaptation over the H.264/AVC design. In addition, the proposal includes improved interpolation and deblocking filters, giving better coding efficiency while having low complexity. Finally, an improved intra coding method is presented. The subjective quality of the proposal is evaluated extensively and the results show that the proposed method achieves similar visual quality to H.264/AVC High Profile anchors with around 50% and 35% bit-rate reduction for low-delay and random-access experiments, respectively, on high-definition sequences. This is achieved with less complexity than H.264/AVC Baseline Profile, making the proposal especially suitable for resource-constrained environments.
ABSTRACT: Two novel algorithms are proposed for improving the coding efficiency of adaptive interpolation schemes for video codecs, without increasing implementation complexity. The proposed algorithms utilize two different filter structures with equal tap length but with complementary frequency responses. Depending on the content being coded, the encoder selects which of the two filter structures is optimal and signals this information to the decoder. In addition, the symmetry of the filters is not pre-defined but flexible. The encoder selects the optimal filter symmetry depending on the coding rate and the content and signals this information to the decoder. Experimental results show that the proposed improvements bring up to 7% bit-rate reduction at high bit-rates over conventional adaptive interpolation. When compared to H.264/AVC, the average gain over the test set is 11%. Coding efficiency is improved without increasing complexity, thus the proposed algorithms are suitable for mobile multimedia use-cases, where computational resources are very limited.
Image Processing (ICIP), 2009 16th IEEE International Conference on; 12/2009
ABSTRACT: A novel adaptive interpolation filter structure for video coding with motion-compensated prediction is presented in this letter. The proposed scheme uses an independent directional adaptive interpolation filter for each sub-pixel location. The Wiener interpolation filter coefficients are computed analytically for each inter-coded frame at the encoder side and transmitted to the decoder. Experimental results show that the proposed method achieves up to 1.1 dB coding gain and a 15% average bit-rate reduction for high-resolution video materials compared to the standard nonadaptive interpolation scheme of H.264/AVC, while requiring 36% fewer arithmetic operations for interpolation. The proposed interpolation can be implemented in exactly 16-bit arithmetic, thus it can have important use-cases in mobile multimedia environments where the computational resources are severely constrained.
IEEE Transactions on Circuits and Systems for Video Technology 09/2009; · 1.82 Impact Factor
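The analytic Wiener solution mentioned above amounts to solving the normal equations R f = c, where R is the autocorrelation of the reference-sample patches and c their cross-correlation with the target samples. A minimal floating-point sketch using plain Gaussian elimination (the codec itself would use per-sub-pixel statistics and a fixed-point, 16-bit-friendly formulation):

```python
def wiener_filter(patches, targets):
    """Compute least-squares filter taps f minimizing sum((s - f.p)^2)
    over (patch p, target sample s) pairs, by forming the normal
    equations R f = c and solving them with Gaussian elimination with
    partial pivoting.  Illustrative only; not the codec's integer math."""
    n = len(patches[0])
    R = [[sum(p[i] * p[j] for p in patches) for j in range(n)] for i in range(n)]
    c = [sum(p[i] * s for p, s in zip(patches, targets)) for i in range(n)]
    A = [row[:] + [ci] for row, ci in zip(R, c)]   # augmented system
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        for r in range(k + 1, n):
            m = A[r][k] / A[k][k]
            for col in range(k, n + 1):
                A[r][col] -= m * A[k][col]
    f = [0.0] * n
    for k in range(n - 1, -1, -1):
        f[k] = (A[k][n] - sum(A[k][j] * f[j] for j in range(k + 1, n))) / A[k][k]
    return f
```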
ABSTRACT: In our previous work, we introduced spatially varying transforms (SVT) for video coding, where the location of the transform block within the macroblock is not fixed but varying. SVT has lower decoding complexity compared to standard methods, as only a portion of the prediction error needs to be decoded. However, the encoding complexity of SVT can be relatively high because of the need to perform rate-distortion optimization (RDO) for each candidate location parameter (LP). In this work, we propose a low complexity algorithm operating on the macroblock and block levels to reduce the encoding complexity of SVT. The proposed low complexity algorithm includes selection of the available candidate LPs based on motion difference and a hierarchical search algorithm. Experimental results show that the proposed low complexity algorithm can eliminate around 80% of the candidate LPs tested in RDO with only a marginal penalty in coding efficiency.
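The hierarchical part of such a search can be sketched in a few lines. The version below is a simplification under stated assumptions: LPs are indexed one-dimensionally, the cost function is supplied by the caller, and the macroblock-level pruning by motion difference is omitted.

```python
def hierarchical_lp_search(cost, num_lp=81, step=3):
    """Two-stage coarse-to-fine search over SVT location parameters (LPs):
    evaluate every `step`-th LP, then refine in a +/-(step-1) window
    around the coarse winner.  `cost` is a callable returning the RD cost
    of an LP index.  Returns (best LP, number of LPs evaluated)."""
    coarse_best = min(range(0, num_lp, step), key=cost)
    lo = max(0, coarse_best - (step - 1))
    hi = min(num_lp - 1, coarse_best + (step - 1))
    tested = set(range(0, num_lp, step)) | set(range(lo, hi + 1))
    best = min(tested, key=cost)
    return best, len(tested)
```

With 81 candidates and step 3, only about 31 LPs are evaluated instead of 81, illustrating how the candidate count drops sharply while the optimum is still found when the cost surface is reasonably smooth.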
ABSTRACT: This paper presents novel methods for implementing adaptive interpolation filtering techniques for video coding on 16-bit integer architectures. The proposed methods offer significant complexity reduction over the state-of-the-art, making them especially valuable for resource constrained mobile multimedia devices, with high coding efficiency.
IEEE Transactions on Consumer Electronics 03/2009; · 1.09 Impact Factor
ABSTRACT: Multiview video has gained wide interest recently. The huge amount of data that needs to be processed by multiview applications is a heavy burden for both transmission and decoding. The Joint Video Team has recently devoted part of its effort to extending the widely deployed H.264/AVC standard to handle multiview video coding (MVC). The MVC extension of H.264/AVC includes a number of new techniques for improved coding efficiency, reduced decoding complexity, and new functionalities for multiview operations. MVC takes advantage of some of the interfaces and transport mechanisms introduced for the scalable video coding (SVC) extension of H.264/AVC, but the system-level integration of MVC is conceptually more challenging, as the decoder output may contain more than one view and can consist of any combination of the views at any temporal level. The generation of all the output views also requires careful consideration and control of the available decoder resources. In this paper, multiview applications and solutions to support generic multiview as well as 3D services are introduced. The proposed solutions, which have been adopted into the draft MVC specification, cover a wide range of requirements for 3D video related to the interface, transport of MVC bitstreams, and MVC decoder resource management. The features that have been introduced in MVC to support these solutions include marking of reference pictures, support for efficient view switching, structuring of the bitstream, signalling of view scalability supplemental enhancement information (SEI), and parallel decoding SEI.
ABSTRACT: In our previous work, we introduced Spatially Varying Transforms (SVT) for video coding, where the location of the transform block within the macroblock is not fixed but varying. In this paper, we extend this concept and present a novel method, called Variable Block-size Spatially Varying Transforms (VBSVT). VBSVT utilizes Variable Block-size Transforms (VBT) in the SVT framework, and is shown to be preferable for coding prediction error with different characteristics than fixed block-size SVT and also the standard methods that use fixed or adaptive block sizes at fixed spatial locations. In addition, VBSVT has similar decoding complexity to fixed block-size SVT and lower decoding complexity compared to standard methods, as only a portion of the prediction error needs to be decoded. Experimental results show that VBSVT achieves a 4.1% gain over H.264/AVC on average over a wide test set. Gains become more significant at high quality levels and go up to 13.5%, which makes the proposed algorithm very suitable for future video coding solutions focusing on high-fidelity applications.
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2009, 19-24 April 2009, Taipei, Taiwan; 01/2009 · 1.82 Impact Factor
ABSTRACT: In order to compensate for the temporally changing effect of aliasing and improve the coding efficiency of video coders, adaptive interpolation filtering schemes have recently been proposed. In such schemes, the encoder computes the interpolation filter coefficients for each frame and then re-encodes the frame with the new adaptive filter. However, the coding efficiency benefit comes at the expense of increased encoding complexity due to this additional encoding pass. In this paper, we present two novel algorithms to reduce the encoding complexity of adaptive interpolation filtering schemes. The first algorithm reduces the complexity of the second encoding pass by using a very lightweight motion estimation algorithm that reuses the data already computed in the first encoding pass. The second algorithm eliminates the second coding pass and reuses the filter coefficients already computed for previous frames. Experimental results show that the proposed methods achieve between 1.5 and 2 times encoding complexity reduction with practically negligible penalty in coding efficiency.
Multimedia Signal Processing, 2008 IEEE 10th Workshop on; 11/2008