-
ICIP'13 (Melbourne, Australia, 2013), Proc., to appear.; 01/2013
-
ABSTRACT: A unified optimization framework for rate allocation among multiple video multicast sessions sharing a wireless network is presented. Our framework applies to delivery of both scalable and non-scalable video streams. In both cases, the optimization objective is to minimize the total video distortion of all peers without incurring excessive network utilization. Our system model explicitly accounts for heterogeneity in wireless link capacities, traffic contention among neighboring links, as well as different video rate-distortion (RD) characteristics. The proposed distributed rate allocation scheme leverages cross-layer information exchange between the media access control and application layers to achieve fast convergence at the optimal, media-aware allocation. Performance of the proposed media-aware rate allocation protocol is compared against a heuristic scheme based on TCP-friendly rate control (TFRC). In network simulations of standard-definition video streaming over single or multiple multicast trees, the proposed scheme consistently achieves higher overall video quality than the TFRC-based heuristics. When delivering scalable streams, the flexibility of per-peer rate adaptation inside each multicast tree yields a further slight improvement in overall video quality over multicast of non-scalable streams.
IEEE Transactions on Circuits and Systems for Video Technology 10/2011; · 1.65 Impact Factor
-
ABSTRACT: Quadtree-based block partitioning together with motion-compensated prediction has proven to be an efficient approach in video compression. However, when dealing with spatially neighboring blocks in uniformly displaced regions, quadtree-based partitioning may lead to redundant sets of transmitted motion parameters. This paper proposes and describes a simple but efficient block merging algorithm that aims at removing those redundancies by using only a single parameter set for a whole motion-compensated region of contiguous blocks. Simulation results show that our proposed merging technique works more efficiently than the conceptually similar direct mode as, e.g., specified in H.264/AVC. Due to its efficiency and simplicity, our proposed merging approach has been adopted into the first test model of the high efficiency video coding (HEVC) standardization project, as currently pursued by ITU-T VCEG and ISO/IEC MPEG.
Multimedia and Expo (ICME), 2011 IEEE International Conference on; 08/2011
-
ABSTRACT: The bitstream structure of layered media formats such as scalable video coding (SVC) or multiview video coding (MVC) opens up new opportunities for their distribution in Mobile TV services. Features like graceful degradation or the support of the 3-D experience in a backwards-compatible way are enabled. The reason is that parts of the media stream are more important than others, with each part itself providing a useful media representation. Typically, the decoding of some parts of the bitstream is possible only if the corresponding more important parts are correctly received. Hence, unequal error protection (UEP) can be applied, protecting important parts of the bitstream more strongly than others. Mobile broadcast systems typically apply forward error correction (FEC) on upper layers to cope with transmission errors that the physical layer FEC cannot correct. Today's FEC solutions are optimized to transmit single layer video. The exploitation of the dependencies in layered media codecs for UEP using FEC is the subject of this paper. The presented scheme, which is called layer-aware FEC (LA-FEC), incorporates the dependencies of the layered video codec into the FEC code construction. A combinatorial analysis is derived to show the potential theoretical gain in terms of FEC decoding probability and video quality. Furthermore, the implementation of LA-FEC as an extension of the Raptor FEC and the related signaling are described. Experimental results for the layer-aware Raptor code with SVC in a DVB-H environment show significant improvements achieved by LA-FEC.
IEEE Transactions on Multimedia 07/2011; · 1.93 Impact Factor
-
ABSTRACT: A depth image-based rendering (DIBR) approach with advanced inpainting methods is presented. The DIBR algorithm can be used in 3-D video applications to synthesize a number of different perspectives of the same scene, e.g., from a multiview-video-plus-depth (MVD) representation. This MVD format consists of video and depth sequences for a limited number of original camera views of the same natural scene. Here, DIBR methods allow the computation of additional new views. An inherent problem of the view synthesis concept is the fact that image information which is occluded in the original views may become visible, especially in extrapolated views beyond the viewing range of the original cameras. The presented algorithm synthesizes these occluded textures. The synthesizer achieves visually satisfying results by taking spatial and temporal consistency measures into account. Detailed experiments show significant objective and subjective gains of the proposed method in comparison to the state-of-the-art methods.
IEEE Transactions on Multimedia 07/2011; · 1.93 Impact Factor
-
ABSTRACT: View synthesis algorithms suitable for mobile devices are presented and discussed. The focus is on low-complexity rendering methods that enable line-wise processing. A full-pixel-accurate rendering algorithm is compared to two algorithms using sub-pixel accuracy: one sub-pixel-accurate method applies a linear interpolation, while the other method utilizes rendering in an up-sampled domain. An analysis of the rendered views and the corresponding stereo views is given. The evaluation investigates in particular the 3D impression, hole filling in disoccluded areas, and flickering. Finally, optimized versions of the evaluated algorithms are presented and their computational complexities are discussed.
3DTV Conference: The True Vision - Capture, Transmission and Display of 3D Video (3DTV-CON), 2011; 06/2011
-
ABSTRACT: Current 3-D video (3DV) technology is based on stereo systems. These systems use stereo video coding for pictures delivered by two input cameras. Typically, such stereo systems only reproduce these two camera views at the receiver, and stereoscopic displays for multiple viewers require wearing special 3-D glasses. On the other hand, emerging autostereoscopic multiview displays emit a large number of views to enable 3-D viewing for multiple users without requiring 3-D glasses. For representing a large number of views, a multiview extension of stereo video coding is used, typically requiring a bit rate that is proportional to the number of views. However, since the quality improvement of multiview displays will be governed by an increase of emitted views, a format is needed that allows the generation of arbitrary numbers of views with the transmission bit rate being constant. Such a format is the combination of video signals and associated depth maps. The depth maps provide disparities associated with every sample of the video signal that can be used to render arbitrary numbers of additional views via view synthesis. This paper describes efficient coding methods for video and depth data. For the generation of views, synthesis methods are presented, which mitigate errors from depth estimation and coding.
Proceedings of the IEEE 05/2011; · 6.81 Impact Factor
-
ABSTRACT: Significant improvements in video compression capability have been demonstrated with the introduction of the H.264/MPEG-4 advanced video coding (AVC) standard. Since developing this standard, the Joint Video Team of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) has also standardized an extension of that technology that is referred to as multiview video coding (MVC). MVC provides a compact representation for multiple views of a video scene, such as multiple synchronized video cameras. Stereo-paired video for 3-D viewing is an important special case of MVC. The standard enables inter-view prediction to improve compression capability, as well as supporting ordinary temporal and spatial prediction. It also supports backward compatibility with existing legacy systems by structuring the MVC bitstream to include a compatible “base view.” Each other view is encoded at the same picture resolution as the base view. In recognition of its high-quality encoding capability and support for backward compatibility, the stereo high profile of the MVC extension was selected by the Blu-ray Disc Association as the coding format for 3-D video with high-definition resolution. This paper provides an overview of the algorithmic design used for extending H.264/MPEG-4 AVC towards MVC. The basic approach of MVC for enabling inter-view prediction and view scalability in the context of H.264/MPEG-4 AVC is reviewed. Related supplemental enhancement information (SEI) metadata is described. Various “frame compatible” approaches for support of stereo-view video as an alternative to MVC are discussed. A summary of the coding performance achieved by MVC for both stereo- and multiview video is provided. Future directions and challenges related to 3-D video are briefly discussed.
Proceedings of the IEEE 05/2011; · 6.81 Impact Factor
-
ABSTRACT: HTTP-based delivery for Video on Demand (VoD) has been gaining popularity within recent years. Progressive Download over HTTP, typically used in VoD, takes advantage of the widely deployed network caches to release video servers from sending the same content to a high number of users in the same VoD service. However, due to the inherent heterogeneity of user demands, which may result in requesting the same video content in different resolutions or qualities, the caching efficiency is expected to decrease due to a higher variety in requested media files. The use of Scalable Video Coding allows different representations of the same content to be combined in a single file, whose parts, aka layers, are requested sequentially by a user up to the maximum desired quality. In this paper we show the benefits of using Scalable Video Coding to maintain the same set of possible video content representations, while at the same time maximizing the caching efficiency.
Consumer Communications and Networking Conference (CCNC), 2011 IEEE; 02/2011
-
ABSTRACT: Typical interpolation methods in video coding perform filtering of reference picture samples using FIR filters for motion-compensated prediction. This process can be viewed as a signal decomposition using basis functions which are restricted by the interpolating constraint. Using the concept of generalized interpolation provides a greater degree of freedom for selecting basis functions. We implemented generalized interpolation using a combination of IIR and FIR filters. The complexity of the proposed scheme is comparable to that of an 8-tap FIR filter. Bit rate savings up to 20% compared to the H.264/AVC 6-tap filter are shown.
Picture Coding Symposium (PCS), 2010; 01/2011
-
ABSTRACT: In this paper, novel intra prediction methods based on image inpainting approaches are proposed. The H.264/AVC intra prediction modes are not well suited for processing complex textures at low bit rates. Our algorithm utilizes an efficient combination of partial differential equations (PDEs) and patch-based texture synthesis in addition to the standard directional predictors. Bit rate savings up to 3.5% compared to that of the H.264/AVC standard are shown.
Picture Coding Symposium (PCS), 2010; 01/2011
-
D. Marpe, H. Schwarz, S. Bosse, B. Bross, P. Helle, T. Hinz, H. Kirchhoffer, H. Lakshman, Tung Nguyen, S. Oudin, M. Siekmann, K. Sühring, M. Winken, T. Wiegand
ABSTRACT: This paper describes a novel video coding scheme that can be considered as a generalization of the block-based hybrid video coding approach of H.264/AVC. While the individual building blocks of our approach are kept simple, similar to those of H.264/AVC, the flexibility of the block partitioning for prediction and transform coding has been substantially increased. This is achieved by the use of nested and pre-configurable quadtree structures, such that the block partitioning for temporal and spatial prediction as well as the space-frequency resolution of the corresponding prediction residual can be adapted to the given video signal in a highly flexible way. In addition, techniques for an improved motion representation as well as a novel entropy coding concept are included. The presented video codec was submitted to a Call for Proposals of ITU-T VCEG and ISO/IEC MPEG and was ranked among the five best performing proposals, both in terms of subjective and objective quality.
Picture Coding Symposium (PCS), 2010; 01/2011
-
ABSTRACT: Video transmission over error-prone channels, as is typical for Mobile TV or IPTV systems, is a constant subject of research. Simulation is an important instrument to evaluate performance of the overall system, but the multitude of parameters often requires large and time-consuming simulation sets. In this paper, we present a mechanism for fast evaluation of error-prone H.264/AVC and SVC video transmission with application-level metrics. Our approach significantly reduces overall simulation time by eliminating redundancy in the evaluation phase and utilizing the prediction structure of H.264/AVC and SVC. The benefit of the presented approach is evaluated with an exemplary simulation setup of a Mobile TV scenario via DVB-H.
Computer Aided Modeling, Analysis and Design of Communication Links and Networks (CAMAD), 2010 15th IEEE International Workshop on; 01/2011
-
ABSTRACT: The five papers in this special section were among those submitted in response to the joint call for proposals on high efficiency video coding (HEVC) standardization. Although at this point of development it is still unclear which specific elements the final HEVC standard will contain, the selection of the papers was made such that together they would cover most of the promising tools and technologies that seem likely to be included in the standard.
IEEE Transactions on Circuits and Systems for Video Technology 01/2011; · 1.65 Impact Factor
-
ABSTRACT: Transforms with block-overlapping basis functions have been proposed for image coding in order to overcome annoying blocking artifacts. Typically in transform coding, the encoder determines the transform coefficient values by applying the forward transform followed by scalar quantization. In this paper we present an approach by which rate-distortion optimized Lapped Biorthogonal Transform (LBT) coefficient values can be determined by solving ℓ₁-regularized least squares problems. We compare a global version, where all the transform coefficients are obtained in one single optimization step, and a local version, where the optimization is done separately for each block, which results in losing optimality but achieves highly reduced complexity. Our simulation results show gains of about 0.5 dB PSNR compared to ordinary forward transform and scalar quantization, with only small losses (< 0.1 dB) for the local variant.
Image Processing (ICIP), 2010 17th IEEE International Conference on; 10/2010
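The ℓ₁-regularized least squares problem named in the abstract above has the general form: minimize 0.5·‖x − Bc‖² + λ·‖c‖₁ over the coefficient vector c. The sketch below solves a toy instance with iterative soft-thresholding (ISTA), a generic textbook solver chosen here for illustration only; the abstract does not say which solver the authors use. The 3×3 matrix `basis`, the signal `x`, the weight `lam`, and the step size are all hypothetical values, and quantization is left out.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal operator of t * |v|)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def objective(x, basis, c, lam):
    """0.5 * ||x - B c||^2 + lam * ||c||_1, with B given as a list of rows."""
    residual = [xi - sum(b * cj for b, cj in zip(row, c))
                for xi, row in zip(x, basis)]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(cj) for cj in c)


def ista(x, basis, lam, step, iterations=500):
    """Iterative soft-thresholding for l1-regularized least squares."""
    n_coeffs = len(basis[0])
    c = [0.0] * n_coeffs
    for _ in range(iterations):
        # Residual B c - x of the current estimate.
        residual = [sum(b * cj for b, cj in zip(row, c)) - xi
                    for xi, row in zip(x, basis)]
        # Gradient of the quadratic term: B^T (B c - x).
        grad = [sum(basis[i][j] * residual[i] for i in range(len(x)))
                for j in range(n_coeffs)]
        # Gradient step followed by shrinkage.
        c = [soft_threshold(cj - step * gj, step * lam)
             for cj, gj in zip(c, grad)]
    return c


# Hypothetical non-orthogonal synthesis matrix (columns overlap) and signal.
basis = [[1.0, 0.5, 0.0],
         [0.0, 1.0, 0.5],
         [0.0, 0.0, 1.0]]
x = [1.0, 0.1, 0.05]

# Step size 0.25 is below 1 / ||B||_F^2 = 1 / 3.5, which keeps ISTA stable.
coeffs = ista(x, basis, lam=0.1, step=0.25)
print(coeffs, objective(x, basis, coeffs, 0.1))
```

Raising `lam` drives more coefficients to exactly zero, which is the mechanism by which the regularizer trades distortion against a proxy for rate in this kind of formulation.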
-
ABSTRACT: This paper introduces a correlation histogram method for analyzing the different components of depth-enhanced 3D video representations. Depth-enhanced 3D representations such as multi-view video plus depth consist of two components: video and depth map sequences. As depth maps represent the scene geometry, their characteristics differ from the video data. We present a comparative analysis that identifies the significant characteristics of the two components via correlation histograms. These characteristics are of special importance for compression. Modern video codecs like H.264/AVC are highly optimized to the statistical properties of natural video. Therefore the effect of compressing the two components using the MVC extension of H.264/AVC is evaluated in the second part of the analysis. The presented results show that correlation histograms are a powerful and well-suited method for analyzing the impact of processing on the characteristics of depth-enhanced 3D video.
Image Processing (ICIP), 2010 17th IEEE International Conference on; 10/2010
-
ABSTRACT: P2P streaming has attracted great interest in recent years, since it reduces the load on expensive servers through the participation of receivers in the media transmission. In this paper, P2P content delivery is shown to be a promising technique for video group communication, for which the main requirement is low delay. Combining low-delay encoding and low-delay P2P Application Layer Multicast makes it possible to fulfill the delay constraints for interactive group communication applications. For such an application, congestion is a considerable problem, since it causes packet loss or late arrival of the packets, degrading the quality of the service. The results presented in this paper show how rate adaptation in combination with Scalable Video Coding (SVC) helps to overcome problems in the network, providing a better solution than transmitting non-adaptive single layer coding.
Image Processing (ICIP), 2010 17th IEEE International Conference on; 10/2010
-
ABSTRACT: The introduction of the first 3D systems for digital cinema and home entertainment is based on stereo technology. For efficiently supporting new display types, depth-enhanced formats and coding technology are required, as introduced in this overview paper. First, we discuss the necessity for a generic 3D video format, as the current state-of-the-art in multi-view video coding cannot support different types of multi-view displays at the same time. Therefore, a generic depth-enhanced 3D format is developed, where any number of views can be generated from one bit stream. This, however, requires a complex framework for 3D video, where not only the 3D format and new coding methods are investigated, but also view synthesis and the provision of high-quality depth maps, e.g. via depth estimation. We present this framework and discuss the interdependencies between the different modules.
Image Processing (ICIP), 2010 17th IEEE International Conference on; 10/2010
-
ABSTRACT: Depth-image-based rendering (DIBR) is used to generate additional views of a real-world scene from images or videos and associated per-pixel depth information. An inherent problem of the view synthesis concept is the fact that image information which is occluded in the original view may become visible in the “virtual” image. The resulting question is: how can these disocclusions be covered in a visually plausible manner? In this paper, a new temporally and spatially consistent hole filling method for DIBR is presented. In a first step, disocclusions in the depth map are filled. Then, a background sprite is generated and updated with every frame using the original and synthesized information from previous frames to achieve temporally consistent results. Next, small holes resulting from depth estimation inaccuracies are closed in the textured image, using methods that are based on solving Laplace equations. The residual disoccluded areas are coarsely initialized and subsequently refined by patch-based texture synthesis. Experimental results are presented, highlighting that gains in objective and visual quality can be achieved in comparison to the latest MPEG view synthesis reference software (VSRS).
Image Processing (ICIP), 2010 17th IEEE International Conference on; 10/2010
-
ABSTRACT: A block based video coder that supports multiple motion models is proposed. Apart from the typical translational motion model, we employ parametric models to more accurately represent complex motions that occur in video sequences. A novel method for estimating the warping parameters in a rate-constrained way is presented. A cubic spline framework is utilized to obtain fractional-accuracy samples for motion compensation. Efficient motion vector prediction schemes are developed to maintain the continuity of the predictor in spite of different motion models. Bit rate savings up to 11.7% in IPPP and 9.3% in Hierarchical B pictures are shown compared to an improved H.264/AVC reference.
Image Processing (ICIP), 2010 17th IEEE International Conference on; 10/2010