ABSTRACT: This paper proposes an image interpolation algorithm exploiting sparse
representation for natural images. It involves three main steps: (a) obtaining
an initial estimate of the high resolution image using linear methods like FIR
filtering, (b) promoting sparsity in a selected dictionary through iterative
thresholding, and (c) extracting high frequency information from the
approximation to refine the initial estimate. For the sparse modeling, a
shearlet dictionary is chosen to yield a multiscale directional representation.
The proposed algorithm is compared to several state-of-the-art methods to
assess its objective as well as subjective performance. Compared to the cubic
spline interpolation method, an average PSNR gain of around 0.8 dB is observed
over a dataset of 200 images.
Signal Processing Image Communication 05/2015; DOI:10.1016/j.image.2015.06.004 · 1.46 Impact Factor
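The three steps above can be sketched in a few lines. The fragment below is a minimal illustration only: it substitutes an orthonormal DCT dictionary for the paper's shearlet dictionary and a simple linearly decreasing hard threshold for the iterative thresholding schedule; the function name and parameters are hypothetical.

```python
import numpy as np
from scipy.fft import dctn, idctn

def sparse_refine(initial, observed, known, n_iters=30, tau0=0.3):
    """Iterative hard thresholding with a data-consistency projection.

    initial : initial high-resolution estimate (e.g. from FIR filtering)
    observed: image holding the known low-resolution samples
    known   : boolean mask marking the positions of those samples
    """
    x = initial.astype(float).copy()
    for k in range(n_iters):
        tau = tau0 * (1.0 - k / n_iters)   # decreasing threshold
        c = dctn(x, norm="ortho")          # analysis in the chosen dictionary
        c[np.abs(c) < tau] = 0.0           # promote sparsity
        x = idctn(c, norm="ortho")         # synthesis
        x[known] = observed[known]         # re-impose the known samples
    return x
```

In the paper the dictionary is a shearlet frame, which adds the multiscale directionality that a separable DCT lacks.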
ABSTRACT: Layered video coding creates multiple layers of unequal importance, which enables us to progressively refine the reconstructed video quality. When the base layer (BL) is corrupted or lost during transmission, the enhancement layers (ELs) must be dropped, regardless of whether they are perfectly decoded or not, which implies that the transmission power assigned to the ELs is wasted. To combat this problem, the class of inter-layer forward error correction (IL-FEC) solutions, also referred to as layer-aware FEC (LA-FEC), has been proposed for layered video transmissions; these jointly encode the BL and the ELs, thereby protecting the BL using the ELs. This tutorial aims to inspire further research on IL-FEC/LA-FEC techniques, with special emphasis on the family of soft-decoded bit-level IL-FEC schemes.
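The core idea can be illustrated with a toy erasure-level sketch: a cross-layer parity packet ties the BL to an EL packet, so a lost BL can be recovered as long as the EL and the parity arrive, while an EL without its BL is discarded. Real IL-FEC/LA-FEC schemes instead embed the BL into the EL's systematic bits and exploit soft-decision channel decoding; the functions below are hypothetical.

```python
import numpy as np

def il_fec_encode(bl, el):
    """Toy inter-layer protection: a parity packet XOR-links BL and EL."""
    return bl, el, bl ^ el  # the three packets actually transmitted

def il_fec_decode(bl, el, parity):
    """Erasures are modeled as None. The EL is useless without the BL."""
    if bl is None and el is not None and parity is not None:
        bl = el ^ parity     # recover the lost base layer from the EL
    if bl is None:
        return None, None    # drop the EL: there is nothing to refine
    return bl, el
```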
ABSTRACT: Recent studies exploit the neural signal recorded via electroencephalography (EEG) to obtain a more objective measurement of perceived video quality. Most of these studies capitalize on the event-related potential component P3. We follow an alternative approach to the measurement problem, investigating steady-state visual evoked potentials (SSVEPs) as EEG correlates of quality changes. Unlike the P3, SSVEPs are directly linked to the sensory processing of the stimuli and do not require long experimental sessions to reach a sufficient signal-to-noise ratio. Furthermore, we investigate the correlation of the EEG-based measures with the outcome of the standard behavioral assessment.
As stimulus material, we used six gray-level natural images at six levels of degradation, created by coding the images with the HM10.0 test model of High Efficiency Video Coding (H.265/MPEG-HEVC) at six different compression rates. The degraded images were presented in rapid alternation with the original images. In this setting, the presence of SSVEPs is a neural marker that objectively indicates the neural processing of the quality changes induced by the video coding. We tested two different machine learning methods to classify such potentials, based on the modulation of the brain rhythm and on time-locked components, respectively.
Results show high accuracies in classifying the neural signal once the quality changes exceed the threshold of perception. Accuracies correlate significantly with the mean opinion scores given by the participants in the standardized degradation category rating quality assessment of the same group of images.
The results show that neural assessment of video quality based on SSVEPs is a viable complement to the behavioral one and a significantly faster alternative to methods based on the P3 component.
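A common way to quantify the presence of an SSVEP is the spectral signal-to-noise ratio: power at the stimulation frequency relative to neighboring frequency bins. The sketch below implements this standard measure only; it is not the machine learning classifiers used in the study, and the function name and parameters are hypothetical.

```python
import numpy as np

def ssvep_snr(eeg, fs, f_stim, n_side=4):
    """Power at the stimulation frequency over the mean power of nearby bins.

    Bins directly adjacent to the target are excluded to ignore window leakage.
    """
    spec = np.abs(np.fft.rfft(eeg * np.hanning(len(eeg)))) ** 2
    freqs = np.fft.rfftfreq(len(eeg), 1.0 / fs)
    k = int(np.argmin(np.abs(freqs - f_stim)))
    side = list(range(k - n_side, k - 1)) + list(range(k + 2, k + n_side + 1))
    return spec[k] / spec[side].mean()
```

A large ratio at the alternation frequency of the degraded/original image presentation indicates that the quality change is being neurally processed.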
ABSTRACT: In recent years, significant progress has been witnessed in several image and video completion scenarios. Given a specific application, these methods can produce, reproduce or extend a given texture sample. While there are many promising algorithms available, there is still a lack of theoretical understanding of how some of them are designed and under which conditions they perform well. To this end, we analyze and describe the technique behind one of the most popular parametric completion algorithms: the autoregressive (AR) model. Furthermore, we address important implementation details, complexity issues and restrictions of the model. Beyond that, we explain how the performance of the AR model can be significantly improved. In summary, this paper aims to achieve three major goals: (1) to provide a comprehensive tutorial for experienced and inexperienced readers, (2) to propose novel methods that improve the performance of 2D-AR completion, and (3) to motivate and guide researchers interested in using the AR model for texture completion tasks.
Signal Processing Image Communication 02/2015; 32. DOI:10.1016/j.image.2015.01.010 · 1.46 Impact Factor
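As a concrete illustration of the parametric approach, the sketch below fits a causal 2D-AR model by least squares on a training texture and then fills a missing region in raster order, so every predictor input is already known when it is needed. It is a minimal sketch; the neighborhood, names and parameters are hypothetical, and it does not include the improvements proposed in the paper.

```python
import numpy as np

OFFS = [(0, 1), (1, 0), (1, 1), (1, -1)]  # left, above, above-left, above-right

def fit_ar2d(tex):
    """Least-squares fit of causal 2D-AR coefficients a_k."""
    H, W = tex.shape
    A = [[tex[i - di, j - dj] for di, dj in OFFS]
         for i in range(1, H) for j in range(1, W - 1)]
    y = [tex[i, j] for i in range(1, H) for j in range(1, W - 1)]
    a, *_ = np.linalg.lstsq(np.array(A), np.array(y), rcond=None)
    return a

def complete_ar2d(img, hole, a):
    """Fill the masked region in raster order using the fitted predictor."""
    out = img.astype(float).copy()
    H, W = out.shape
    for i in range(1, H):
        for j in range(1, W - 1):
            if hole[i, j]:
                out[i, j] = sum(ak * out[i - di, j - dj]
                                for ak, (di, dj) in zip(a, OFFS))
    return out
```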
[Show abstract][Hide abstract] ABSTRACT: Conventionally, the quality of images and related codecs are assessed using subjective tests, such as Degradation Category Rating. These quality assessments consider the behavioral level only. Recently, it has been proposed to complement this approach by investigating how quality is processed in the brain of a user (using electroencephalography, EEG), potentially leading to results that are less biased by subjective factors. In this paper, a novel method is presented for assessing how image quality is processed on a neural level, using Steady-State Visual Evoked Potentials (SSVEPs) as EEG features. We tested our approach in an EEG study with 16 participants who were presented with distorted images of natural textures. Subsequently, we compared our approach analogously to the standardized Degradation Category Rating quality assessment. Remarkably, our novel method yields a correlation of r = 0.93 to MOS on the recorded dataset.
ABSTRACT: The work on support for higher bit depths and 4:2:2 as well as 4:4:4 chroma sampling formats for the High Efficiency Video Coding (HEVC) standard is currently being conducted under the term Range Extensions (RExt). A technique that exploits the correlation between residual color components in the 4:4:4 chroma sampling format, referred to as Cross-Component Prediction (CCP), has been adopted as part of the current RExt draft. In this paper, this relatively simple yet effective CCP scheme is presented. Conceptually, CCP relies on the idea that an adaptively switched predictor based on a linear model is invoked for coding the residuals of the second and third color components by using the residual of the first color component. Experimental results show that, depending on the underlying color space, average bit-rate savings in the range of 2-18% for natural content and 3-26% for screen content can be achieved by CCP.
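In essence, the linear model subtracts a scaled first-component residual, Δr_C = r_C − (α·r_L) >> 3, with the encoder choosing α per block from a small candidate set and signaling it. The sketch below illustrates that idea with a simple SAD-based α search; the exact candidate set, rounding and signaling of the RExt draft may differ, so treat this as a hypothetical illustration.

```python
import numpy as np

ALPHAS = (0, 1, -1, 2, -2, 4, -4, 8, -8)  # candidate weights, scaled by 1/8

def ccp_forward(r_luma, r_chroma):
    """Choose alpha minimizing the SAD of the remaining chroma residual."""
    def residual(a):
        return r_chroma - (a * r_luma) // 8  # // 8 mimics an arithmetic shift
    alpha = min(ALPHAS, key=lambda a: np.abs(residual(a)).sum())
    return alpha, residual(alpha)

def ccp_inverse(r_luma, delta, alpha):
    """Decoder side: add the scaled first-component residual back."""
    return delta + (alpha * r_luma) // 8
```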
ABSTRACT: Delay-sensitive media applications typically prioritise timeliness over reliability, therefore preferring UDP over TCP. Retransmission is a method to compensate for packet loss and requires the receiver to provide timely feedback to the sender. Delaying the retransmission request too long may result in the retransmitted media arriving late. Alternatively, aggressive error estimation, where slightly delayed packets are seen as lost, results in unnecessary bandwidth usage and may contribute to further congestion of the network. We study receiver-based retransmission timeout (RTO) estimation in the context of real-time streaming over Multipath RTP and propose a solution in which we use statistical methods to provide accurate RTO prediction which allows for timely feedback. The proposed approach allows the receiver to accurately estimate the RTO when receiving media over multiple paths irrespective of the scheduling algorithm used at the sender. This enables a sender to take advantage of multiple paths for load balancing or bandwidth aggregation by scheduling media based on dynamic path characteristics.
2014 28th International Conference on Advanced Information Networking and Applications Workshops (WAINA); 05/2014
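The paper's receiver-side estimator is statistical; as a point of reference, the classic Jacobson/Karels-style smoother below tracks a delay mean and deviation and sets RTO = SRTT + 4·RTTVAR. This is a sketch under that assumption, not the proposed method, and the class name and defaults are hypothetical.

```python
class RtoEstimator:
    """Per-path EWMA of a delay sample and its deviation (Jacobson/Karels style)."""

    def __init__(self, alpha=1 / 8, beta=1 / 4, k=4):
        self.alpha, self.beta, self.k = alpha, beta, k
        self.srtt = None
        self.rttvar = 0.0

    def update(self, sample_ms):
        if self.srtt is None:  # first sample initializes the state
            self.srtt, self.rttvar = sample_ms, sample_ms / 2
        else:
            self.rttvar = ((1 - self.beta) * self.rttvar
                           + self.beta * abs(self.srtt - sample_ms))
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * sample_ms
        return self.srtt + self.k * self.rttvar  # the retransmission timeout
```

Keeping one estimator per path lets a multipath receiver judge, per path, whether a retransmission request can still meet the playout deadline.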
ABSTRACT: Real-time video applications, such as video conferencing and video surveillance systems, typically involve the simultaneous transport of multiple video sources to interested parties that consume the content. It may be desirable to mix these multiple source videos into a single video stream at intermediary nodes in the network. This has the advantage of reduced application and transport complexity on the client device and also makes it possible for devices with a single hardware decoder to consume the content. A typical approach is to apply transcoding operations to the original videos, i.e. the videos are decoded, merged and encoded into a single video stream. This paper proposes an alternative solution to video transcoding, which uses the new video coding standard HEVC and has a much lower processing complexity. We consider how our approach can be realized in real-world applications such as a cloud video mixer. Such systems typically require some degree of dynamics and personalization and we provide some insight into how transport signaling complexities can be addressed.
2014 IEEE 11th Consumer Communications and Networking Conference (CCNC); 01/2014
ABSTRACT: In this paper, we present a new local and multiscale disparity estimation algorithm. Results show that the proposed method can preserve arbitrarily shaped depth discontinuities. Results obtained with this method are shown as depth maps and as synthetic views rendered using the estimated disparity maps.
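For contrast with the proposed multiscale method, the standard local baseline is winner-takes-all block matching along the horizontal epipolar line; its fixed square windows are exactly what smears arbitrarily shaped depth discontinuities. The sketch below is this baseline only, not the paper's algorithm.

```python
import numpy as np

def block_match_disparity(left, right, max_d=8, half=2):
    """Winner-takes-all SAD block matching for a rectified stereo pair."""
    H, W = left.shape
    disp = np.zeros((H, W), dtype=int)
    for i in range(half, H - half):
        for j in range(half + max_d, W - half):
            patch = left[i - half:i + half + 1, j - half:j + half + 1]
            costs = [np.abs(patch - right[i - half:i + half + 1,
                                          j - d - half:j - d + half + 1]).sum()
                     for d in range(max_d + 1)]
            disp[i, j] = int(np.argmin(costs))
    return disp
```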
ABSTRACT: High Efficiency Video Coding (HEVC) is the most recent jointly developed video coding standard of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). Although its basic architecture is built along the conventional hybrid block-based approach of combining prediction with transform coding, HEVC includes a number of coding tools with greatly enhanced coding-efficiency capabilities relative to those of prior video coding standards. Among these tools are new transform coding techniques that include support for dyadically increasing transform block sizes ranging from 4 × 4 to 32 × 32, the partitioning of residual blocks into variable-block-size transforms using a quadtree-based partitioning dubbed the residual quadtree (RQT), as well as properly designed entropy coding techniques for quantized transform coefficients of variable transform block sizes. In this paper, we describe these HEVC techniques for transform coding, with a particular focus on the RQT structure and the entropy coding stage, and demonstrate their benefit in terms of improved coding efficiency through experimental results.
IEEE Journal of Selected Topics in Signal Processing 12/2013; 7(6):978-989. DOI:10.1109/JSTSP.2013.2278071 · 2.37 Impact Factor
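The RQT principle can be sketched as a greedy split decision: transform the residual block whole, or recurse into four quadrants when the children together are cheaper, using the count of significant DCT coefficients as a toy rate measure. A real HEVC encoder uses rate-distortion optimization with actual entropy coding; everything below (names, costs, thresholds) is a hypothetical illustration of the quadtree mechanics only.

```python
import numpy as np
from scipy.fft import dctn

def toy_rate(block, thresh=1.0):
    """Toy rate proxy: number of significant transform coefficients."""
    return int(np.count_nonzero(np.abs(dctn(block, norm="ortho")) > thresh))

def rqt_partition(block, i=0, j=0, min_size=4, split_bits=2):
    """Return (leaves, cost); leaves are (row, col, size) transform blocks."""
    n = block.shape[0]
    whole = toy_rate(block)
    if n <= min_size:
        return [(i, j, n)], whole
    h = n // 2
    leaves, cost = [], split_bits  # signaling overhead of a split flag
    for di in (0, h):
        for dj in (0, h):
            sub, c = rqt_partition(block[di:di + h, dj:dj + h],
                                   i + di, j + dj, min_size, split_bits)
            leaves += sub
            cost += c
    return (leaves, cost) if cost < whole else ([(i, j, n)], whole)
```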
ABSTRACT: The high bit rates of high-definition or 3D services require a huge share of the valuable terrestrial spectrum, especially when targeting wide coverage areas. This article describes how to provide future services with the state-of-the-art digital terrestrial TV technology DVB-T2 in a flexible and cost-efficient way. The combination of layered media, such as the scalable and 3D extensions of the H.264/AVC or emerging H.265/HEVC format, with the physical layer pipes feature of DVB-T2 enables flexible broadcast of services with differentiated protection of the quality layers. This opens up new ways of service provisioning, such as graceful degradation for mobile or fixed reception. This article shows how existing DVB-T2 and MPEG-2 transport stream mechanisms need to be configured to offer such services over DVB-T2. A detailed description of the setup of such services and the involved components is given.
ABSTRACT: This paper proposes an image interpolation algorithm exploiting sparse representation for natural images. It involves three steps: (a) obtaining an initial estimate of the high resolution image using linear methods like FIR filtering, (b) promoting sparsity in a selected dictionary through thresholding and (c) extracting high frequency information from the approximation and adding it to the initial estimate. For the sparse modeling, a shearlet dictionary is chosen to yield a multi-scale directional representation. The proposed algorithm is compared to several state-of-the-art methods to assess its objective as well as subjective performance. Compared to the cubic spline interpolation method, an average PSNR gain of around 0.7 dB is observed over a dataset of 200 images.
Proc. ICIP'13, Melbourne, Australia, to appear; 09/2013
ABSTRACT: The paper describes an extension of the high efficiency video coding (HEVC) standard for coding of multi-view video and depth data. In addition to the known concept of disparity-compensated prediction, inter-view motion parameter and inter-view residual prediction for coding of the dependent video views have been developed and integrated. Furthermore, for depth coding, new intra coding modes, a modified motion compensation and motion vector coding as well as the concept of motion parameter inheritance are part of the HEVC extension. A novel encoder control uses view synthesis optimization, which guarantees that high quality intermediate views can be generated based on the decoded data. The bitstream format supports the extraction of partial bitstreams, so that conventional 2D video, stereo video and the full multi-view video plus depth (MVD) format can be decoded from a single bitstream. Objective and subjective results are presented, demonstrating that the proposed approach provides about 50% bit rate savings in comparison to HEVC simulcast and about 20% in comparison to a straightforward multi-view extension of HEVC without the newly developed coding tools.
ABSTRACT: Fractional sample interpolation with finite impulse response (FIR) filters is commonly used for motion-compensated prediction (MCP). The FIR filtering can be viewed as a signal decomposition using restricted basis functions. The concept of generalized interpolation provides a greater degree of freedom for selecting basis functions. We developed a generalized interpolation framework for MCP using fixed-point infinite impulse response and FIR filters. An efficient multiplication-free design of the algorithm that is suited for hardware implementation is shown. A detailed analysis of average and worst case complexities compared to FIR filter-based interpolation techniques is provided. Average bitrate savings of around 2.0% compared to an 8-tap FIR filter are observed over the high-efficiency video coding dataset at a similar worst case complexity.
IEEE Transactions on Circuits and Systems for Video Technology 03/2013; 23(3):455-466. DOI:10.1109/TCSVT.2012.2207273 · 2.62 Impact Factor
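Generalized interpolation means the samples are first converted to expansion coefficients by an inverse (IIR) prefilter, after which a short FIR kernel evaluates fractional positions. The floating-point cubic B-spline case below illustrates the principle (causal/anticausal recursive prefilter with pole z = √3 − 2, then a 4-tap kernel); the paper's actual design is fixed-point and multiplication-free, so this is only a conceptual sketch.

```python
import numpy as np

def bspline3_prefilter(s):
    """IIR prefilter: samples -> cubic B-spline coefficients (pole sqrt(3) - 2)."""
    z = np.sqrt(3.0) - 2.0
    n = len(s)
    c = 6.0 * np.asarray(s, dtype=float)
    c[0] = 6.0 * sum(s[k] * z ** k for k in range(min(n, 30)))  # truncated init
    for k in range(1, n):                      # causal pass
        c[k] += z * c[k - 1]
    c[-1] = (z / (z * z - 1.0)) * (c[-1] + z * c[-2])
    for k in range(n - 2, -1, -1):             # anticausal pass
        c[k] = z * (c[k + 1] - c[k])
    return c

def bspline3_eval(c, x):
    """FIR step: evaluate the spline expansion at fractional position x."""
    i = int(np.floor(x))
    out = 0.0
    for k in range(i - 1, i + 3):
        t = abs(x - k)
        w = 2/3 - t*t + t**3/2 if t < 1 else ((2 - t)**3 / 6 if t < 2 else 0.0)
        out += c[min(max(k, 0), len(c) - 1)] * w
    return out
```

The prefilter is where the greater freedom in basis functions comes from: the 4-tap evaluation kernel alone does not interpolate, but combined with the IIR inversion it reproduces the samples exactly.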
ABSTRACT: Intra prediction is a fundamental tool in video coding with hybrid
block-based architecture. Recent investigations have shown that one of
the most beneficial elements for a higher compression performance in
high-resolution videos is the incorporation of larger block structures.
Thus, in this work, we investigate the performance of novel intra
prediction modes based on different image completion techniques in a new
video coding scheme with large block structures. Image completion
methods address the fact that high-frequency image regions yield high
coding costs when using classical H.264/AVC prediction modes. This
problem is tackled by incorporating several intra predictors based on
the concept of the Laplace partial differential equation (PDE),
Least-Squares (LS) based linear prediction and the Auto-Regressive (AR)
model. A major aspect of this article is the evaluation of the coding
performance in a quantitative (i.e. coding efficiency) manner.
Experimental results show significant improvements in compression (up to
7.41%) from integrating the LS-based linear intra prediction.
Proceedings of SPIE - The International Society for Optical Engineering 02/2013; 8666. DOI:10.1117/12.2007942 · 0.20 Impact Factor
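The LS-based mode can be sketched as follows: weights for a small causal neighborhood are fitted by least squares on the reconstructed pixels around the block (so a decoder could repeat the fit without side information), then applied recursively inside the block. The neighborhood, training window and names below are hypothetical, not the paper's exact configuration.

```python
import numpy as np

NEI = [(0, 1), (1, 0), (1, 1)]  # left, above, above-left (causal)

def ls_intra_predict(rec, bi, bj, n, train=8):
    """Predict the n x n block at (bi, bj) from the causal reconstructed region."""
    A, y = [], []
    for i in range(max(1, bi - train), bi + n):
        for j in range(max(1, bj - train), bj + n):
            if i >= bi and j >= bj:
                continue                     # inside the block: unknown
            A.append([rec[i - di, j - dj] for di, dj in NEI])
            y.append(rec[i, j])
    w, *_ = np.linalg.lstsq(np.array(A), np.array(y), rcond=None)
    pred = rec.astype(float).copy()
    for i in range(bi, bi + n):              # raster order inside the block
        for j in range(bj, bj + n):
            pred[i, j] = sum(wk * pred[i - di, j - dj]
                             for wk, (di, dj) in zip(w, NEI))
    return pred[bi:bi + n, bj:bj + n]
```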
ABSTRACT: This paper describes an extension of the upcoming High Efficiency Video
Coding (HEVC) standard for supporting spatial and quality scalable video
coding. Besides scalable coding tools known from scalable profiles of
prior video coding standards such as H.262/MPEG-2 Video and H.264/MPEG-4
AVC, the proposed scalable HEVC extension includes new coding tools that
further improve the coding efficiency of the enhancement layer. In
particular, new coding modes by which base and enhancement layer signals
are combined for forming an improved enhancement layer prediction signal
have been added. All scalable coding tools have been integrated in a way
that the low-level syntax and decoding process of HEVC remain unchanged
to a large extent. Simulation results for typical application scenarios
demonstrate the effectiveness of the proposed design. For spatial and
quality scalable coding with two layers, bit-rate savings of about
20-30% have been measured relative to simulcasting the layers, which
corresponds to a bit-rate overhead of about 5-15% relative to
single-layer coding of the enhancement layer.
Proceedings of SPIE - The International Society for Optical Engineering 02/2013; 8666. DOI:10.1117/12.2009393 · 0.20 Impact Factor
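Among the new enhancement-layer coding modes, one family combines the (upsampled) base layer signal with an enhancement layer prediction signal. The sketch below shows such a blend in its simplest form; the fixed weight and the nearest-neighbor upsampling are hypothetical placeholders, not the standardized filters or mode decisions.

```python
import numpy as np

def combined_el_prediction(bl_recon, el_temporal_pred, w=0.5):
    """Toy combined prediction: blend the upsampled BL with an EL temporal predictor."""
    bl_up = np.repeat(np.repeat(bl_recon, 2, axis=0), 2, axis=1)  # 2x upsampling
    return w * bl_up + (1.0 - w) * el_temporal_pred
```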
ABSTRACT: Global Internet traffic shows an upward trend, mainly driven by the increasing demand for video services. In addition, the further spread of the mobile Internet leads to an increased diversification of access data rates and Internet terminals. In such a context, Content Delivery Networks (CDNs) are forced to offer content in multiple versions for different resolutions. Moreover, multiple bitrates are needed so that emerging adaptive streaming technologies can adapt to network congestion. This enormous proliferation of multimedia content is becoming more and more of a challenge for the efficiency of existing network and caching infrastructure. Dynamic Adaptive Streaming over HTTP (DASH) is an emerging standard which enables adaptation of the media bitrate to varying throughput conditions by offering multiple representations of the same content. The combination of Scalable Video Coding (SVC) with DASH, called improved DASH (iDASH), basically consists of relying on SVC to provide the different representations. This paper shows how prioritized caching strategies can improve the caching performance of (i)DASH services. Results obtained from statistics of a real-world CDN deployment and a simple revenue model show a clear benefit in revenue for content providers when priority caching is used, especially in combination with iDASH.
The Era of Interactive Media, 01/2013: pages 643-655; , ISBN: 978-1-4614-3500-6
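Prioritized caching can be illustrated with a toy eviction policy: each cached representation carries a priority (e.g. SVC base layers above enhancement layers), and on overflow the lowest-priority, least-recently-used item is evicted. This is a hypothetical sketch of the general idea, not the strategies or revenue model evaluated in the chapter.

```python
class PriorityCache:
    """Toy prioritized cache: evict the lowest-priority, least-recently-used item."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = {}   # key -> (priority, last_use); tuple order drives eviction
        self.clock = 0

    def access(self, key, priority):
        """Record an access; returns True on a cache hit."""
        self.clock += 1
        hit = key in self.items
        if not hit and len(self.items) >= self.capacity:
            victim = min(self.items, key=lambda k: self.items[k])
            del self.items[victim]
        self.items[key] = (priority, self.clock)
        return hit
```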