Multiview Video Coding Using View Interpolation and Color Correction

Universal Media Research Center, Tokyo
IEEE Transactions on Circuits and Systems for Video Technology (Impact Factor: 1.82). 12/2007; DOI: 10.1109/TCSVT.2007.903802
Source: IEEE Xplore

ABSTRACT: In multiview video systems, neighboring views are highly correlated, so we should exploit various neighboring views to compress video efficiently. There are many approaches to doing this; however, most of them treat pictures of other views in the same way as pictures of the current view, i.e., pictures of other views are used as reference pictures (inter-view prediction). In this paper, we introduce two approaches to improving compression efficiency. The first is to synthesize pictures at a given time and position by view interpolation and use them as reference pictures (view-interpolation prediction); in other words, we compensate for scene geometry to obtain precise predictions. The second is to correct the luminance and chrominance of other views with lookup tables, compensating for the photoelectric variations of the individual cameras. We implemented these ideas on top of H.264/AVC with inter-view prediction and confirmed that they work well. Experimental results show that these ideas reduce the number of generated bits by approximately 15% without any loss of PSNR.
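The lookup-table color correction described in the abstract can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' implementation: the function names and the simple bin-averaging scheme for building the table are our own assumptions.

```python
import numpy as np

def build_lut(src_samples, ref_samples, bins=256):
    """Build a per-channel lookup table mapping a side view's sample
    values onto the reference view's, from co-located sample pairs.
    (Hypothetical sketch; averaging per bin is our assumption.)"""
    lut = np.arange(bins, dtype=np.float64)  # identity fallback for unseen values
    sums = np.zeros(bins)
    counts = np.zeros(bins)
    np.add.at(sums, src_samples, ref_samples)   # accumulate reference values per source bin
    np.add.at(counts, src_samples, 1)
    seen = counts > 0
    lut[seen] = sums[seen] / counts[seen]       # mean reference value per source value
    return np.clip(np.round(lut), 0, bins - 1).astype(np.uint8)

def correct_view(channel, lut):
    """Apply the lookup table to one Y, Cb, or Cr plane before
    the corrected view is used for prediction."""
    return lut[channel]
```

A separate table would be built per view and per component (Y, Cb, Cr), so a constant brightness offset between cameras, for example, is absorbed by the table rather than penalizing the inter-view prediction.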

  • ABSTRACT: In general, excessive colorimetric and geometric errors in multi-view images induce visual fatigue in users. Various works have been proposed to reduce these errors, but conventional methods are only available for stereoscopic images, require cumbersome additional tasks, and often produce unstable results. In this paper, we propose an effective multi-view image refinement algorithm. The proposed algorithm analyzes such errors in multi-view images from sparse correspondences and compensates for them automatically. Whereas conventional works transform every view to compensate for geometric errors, the proposed method transforms only the source views with respect to a reference view, so the approach can be extended regardless of the number of views. In addition, we employ uniform view intervals to provide consistent depth perception among views, and we correct color inconsistency among views from the correspondences by considering importance and channel properties. Various experimental results show that the proposed algorithm outperforms conventional approaches and generates more visually comfortable multi-view images.
    Journal of Visual Communication and Image Representation 01/2014; 25(4):698–708.
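The per-view geometric compensation this entry describes (warping each source view toward a reference view using sparse correspondences) can be sketched under the assumption of a 2D similarity model; the paper's actual transform model and all names below are our own illustrative choices.

```python
import numpy as np

def fit_similarity(src_pts, ref_pts):
    """Estimate a 2D similarity transform (scale+rotation a, b and
    translation tx, ty) mapping source-view keypoints onto their
    reference-view correspondences, by linear least squares.
    Model: u = a*x - b*y + tx,  v = b*x + a*y + ty."""
    n = len(src_pts)
    A = np.zeros((2 * n, 4))
    rhs = np.zeros(2 * n)
    for i, ((x, y), (u, v)) in enumerate(zip(src_pts, ref_pts)):
        A[2 * i]     = [x, -y, 1, 0]
        A[2 * i + 1] = [y,  x, 0, 1]
        rhs[2 * i], rhs[2 * i + 1] = u, v
    params, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return params  # (a, b, tx, ty)

def warp_point(p, params):
    """Map one source-view point into the reference view's frame."""
    a, b, tx, ty = params
    x, y = p
    return (a * x - b * y + tx, b * x + a * y + ty)
```

Because only the source views are warped and the reference view is left untouched, the same fit can be repeated independently for any number of additional views.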
  • ABSTRACT: In free-viewpoint TV applications, pre-estimated depth information is available both to synthesize intermediate views and to assist multi-view video coding. Existing view synthesis prediction schemes generate the virtual view picture only from inter-view pictures. However, there are many types of signal mismatch, caused by depth errors, camera heterogeneity, or illumination differences across views, and these mismatches reduce the prediction capability of the virtual view picture. In this paper, we propose an adaptive-learning-based view synthesis prediction algorithm to enhance the prediction capability of the virtual view picture. The algorithm integrates least-squares prediction with backward warping to synthesize the virtual view picture, exploiting not only the adjacent views but also the temporally decoded information to adaptively learn the prediction coefficients. Experiments show that the proposed method reduces the bit rate by up to 18% relative to the multi-view video coding standard, and by about 11% relative to the conventional view synthesis prediction method.
    Journal of Signal Processing Systems 01/2014.
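The least-squares piece of that scheme can be illustrated with a minimal sketch (the plain linear model and all names are our assumptions, not the paper's exact formulation): candidate predictors such as backward-warped neighboring views and temporal decoded pixels are stacked as columns, and coefficients are fit against already-decoded samples so the decoder can derive the same coefficients without side information.

```python
import numpy as np

def fit_ls_coeffs(predictors, decoded_target):
    """Fit weights w minimizing ||A w - t||^2, where each column of A
    is one candidate predictor region (flattened) plus a constant
    offset column, and t is an already-decoded target region."""
    A = np.column_stack([p.ravel() for p in predictors]
                        + [np.ones(decoded_target.size)])
    w, *_ = np.linalg.lstsq(A, decoded_target.ravel(), rcond=None)
    return w

def predict(predictors, w):
    """Form the virtual-view prediction as the learned combination."""
    A = np.column_stack([p.ravel() for p in predictors]
                        + [np.ones(predictors[0].size)])
    return A @ w
```

The constant column lets the fit absorb an illumination offset between views, which is one of the mismatches the abstract mentions.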
  • ABSTRACT: Current research on 3D video compression within MPEG requires the compression of three texture views and their associated depth views. To reduce the additional complexity and bit rate of depth-map encoding, we present a fast mode-decision model based on previously encoded macroblocks of the texture view. We also present techniques to reduce the rate by predicting syntax elements from the corresponding texture view. The proposed system achieves a complexity reduction of 71.08% with an average bit-rate gain of 4.35%.
    2013 IEEE International Conference on Consumer Electronics (ICCE); 01/2013