K. Kamikura

NTT DATA Corporation, Tōkyō, Japan

Publications (29) · 10.23 Total impact

  • ABSTRACT: We propose an MPEG-2 to H.264 transcoding method for interlaced streams in which frame and field macroblocks are intermingled. The method uses the encoding information from the MPEG-2 stream and keeps as many DCT coefficients of the original MPEG-2 bitstream as possible. Experimental results show that the proposed method improves PSNR by about 0.19 to 0.31 dB compared with a conventional method.
    Computer Communications and Networks (ICCCN), 2010 Proceedings of 19th International Conference on; 09/2010
  • ABSTRACT: This paper describes a software-based H.264/AVC HDTV real-time interactive CODEC architecture (named RISCA264-HD) using parallel processing. The RISCA264-HD consists of multiple high-speed encoder/decoder cores, an IP communication part, and an error recovery controller with FEC. It provides Full-HD quality (1920 × 1080 pixels, 29.97 frames per second) using parallel encoding, natural interactive conversation with a low delay of less than 165 ms, and smooth visual communication free from macroblock noise. Combined with a home television and a home digital video camera, the software achieves HDTV-quality bidirectional video communication over commercial IP broadband networks.
    Consumer Electronics (ICCE), 2010 Digest of Technical Papers International Conference on; 02/2010
  • Takeshi Yoshitome, Ken Nakamura, Kazuto Kamikura
    ABSTRACT: We propose a flexible video CODEC system for super-high-resolution videos such as 4k × 2k-pixel video. It uses a spatially parallel encoding approach and scales to the target video resolution. A video shift and padding function has been introduced to prevent the image quality from being degraded when different active-line systems are connected. The switchable cascade multiplexing function of our system enables various super-high resolutions to be encoded and super-high-resolution video streams to be recorded and played back using a conventional PC. A two-stage encoding method using the complexity of each divided image has been introduced to equalize encoding quality among the divided videos. System Time Clock (STC) sharing has also been implemented in this CODEC system to absorb the disparity between channels in stream arrival times. These functions enable highly efficient, high-quality encoding for super-high-resolution video. The system was used for the 6k × 1k video transmission of a soccer tournament and the 4k × 2k video recording of a Saito Kinen orchestral concert.
    12/2009: pages 177-195;
  • ABSTRACT: Over the past decade, video acquisition rates, which had been 24 Hz (cinema), 30-60 Hz (webcam) or 50-60 Hz (SD/HD cameras), have broken through to reach 1000 Hz. In order to display these high frame-rate video signals on current display devices, they must first be down-sampled. This study proposes a down-sampling method suitable for high frame-rate video signals, designed to reduce the inter-frame prediction error and suppress jerkiness between sub-sampled frames. Our method improves the PSNR of the prediction signal by 0.10 to 0.14 dB compared to simple sub-sampling at a constant interval. (A minimal sketch of this down-sampling idea follows the publication list.)
    Image Processing (ICIP), 2009 16th IEEE International Conference on; 12/2009
  • T. Yoshitome, K. Kamikura, N. Kitawaki
    ABSTRACT: An MPEG-2 to H.264 intra transcoding method is proposed. This method uses the encoding information from an MPEG-2 stream and keeps as many DCT coefficients of the original MPEG-2 bitstream as possible. Experimental results show that the proposed method improves PSNR by about 0.76-1.27 dB compared with a typical conventional method.
    Industrial Electronics & Applications, 2009. ISIEA 2009. IEEE Symposium on; 11/2009
  • ABSTRACT: Over the past decade, video acquisition rates, which had been 24 Hz (cinema), 30-60 Hz (webcam) or 50-60 Hz (SD/HD cameras), have broken through to reach 1000 Hz. In order to display these high frame-rate video signals on current display devices in real time, they must first be down-sampled. This study proposes a down-sampling method suitable for high frame-rate video signals, designed to reduce the inter-frame prediction error. Our method improves the PSNR of the prediction signal by 0.13 to 0.23 dB compared to simple sub-sampling at a constant interval.
    Picture Coding Symposium, 2009. PCS 2009; 06/2009
  • ABSTRACT: Transform coding is an essential tool for picture coding, and transforms that achieve high energy compaction are required. In terms of energy compaction, the Karhunen-Loeve transform (KLT) is known to be optimal. The energy compaction provided by the KLT depends on the statistical properties of the input signal and on the dimensionality of the KLT. However, the quantitative effect of KLT dimensionality on energy compaction has not been clarified. This paper establishes a mathematical model of the relationship between the dimensionality of the KLT and the energy compaction of the transform coefficients, using mathematical tools from quantum information theory. (An empirical sketch of KLT energy compaction follows the publication list.)
    Proceedings of the International Conference on Image Processing, ICIP 2009, 7-10 November 2009, Cairo, Egypt; 01/2009
  • ABSTRACT: Choosing the proper coding mode is important in an H.264 encoder, since it offers many more modes than conventional methods such as MPEG-2. Typical H.264 encoders like JM and JSVM use squared error as the distortion criterion for mode decision. However, squared error does not always provide a correct measure of distortion from the viewpoint of subjective quality. In this paper, we investigate a mode decision algorithm based on a spatio-temporal contrast sensitivity model in order to improve the H.264 encoder. Our algorithm is designed to account for the direction dependency of spatio-temporal frequency. Experiments show that our method achieves average bit-rate savings of 4.0 to 5.5% compared to the original JSVM, and we confirm that both methods yield reconstructed images with essentially the same subjective image quality. (A sketch of perceptually weighted mode decision follows the publication list.)
    Image Processing, 2008. ICIP 2008. 15th IEEE International Conference on; 11/2008
  • ABSTRACT: A real-time MVC viewer for free-viewpoint navigation is demonstrated. MVC, a video coding standard for multi-view video, is an extension of the state-of-the-art H.264/AVC. Realizing parallel decoding and view synthesis is essential for developing real-time viewers. A new decoder architecture is introduced for the proposed viewer, along with a file format for MVC. View synthesis is performed entirely on the GPU. It is demonstrated that the MVC viewer can decode an MVC bitstream and generate a virtual camera image in real time.
    Multimedia and Expo, 2008 IEEE International Conference on; 05/2008
  • ABSTRACT: In order to represent 3D space, we have proposed a representation that consists of multi-view video plus a single-view depth map. This format is backward compatible with MPEG-C Part 3 (ISO/IEC 23002-3). This paper proposes a coding scheme for this 3D space representation that is efficient even when low-delay decoding is required. We apply the residual prediction framework to the view synthesis prediction errors. Experiments show that the proposed scheme achieves up to about 7.7% bitrate reduction compared to multi-view video coding with disparity compensation, even though the depth map video is added. Furthermore, the proposed scheme requires no syntax changes to the conventional video coding standard and only minor modifications to the circuits of conventional video codecs, so almost all of the components can be reused. The backward compatibility and reusability achieved by the proposed scheme are important for reducing manufacturing costs and time to market.
    Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on; 05/2008 · 4.63 Impact Factor
  • ABSTRACT: Realistic representations using extremely high quality images are becoming increasingly popular. For example, digital cinemas can now display moving pictures composed of high-resolution digital images. Although these applications focus on increasing the spatial resolution only, higher frame-rates are being considered to achieve more realistic representations. Since increasing the frame-rate increases the total amount of information, efficient coding methods are required; however, the statistical properties of such video have not been clarified. This paper establishes, for high frame-rate video, a mathematical model of the relationship between frame-rate and bit-rate. A coding experiment confirms the validity of the model.
    01/2008;
  • ABSTRACT: In this paper, we propose an efficient scalable multi-view video coding scheme with a novel algorithm to estimate panoramic mosaic depth maps. Multiple view-dependent depth maps are generated by applying a forward depth projection to the decoded panoramic mosaic depth maps in the proposed codec. By using the generated depth maps, the proposed scheme not only improves view synthesis prediction performance, but also achieves free-viewpoint scalability and coarse granular SNR scalability. These functionalities are important for supporting any kind of 3D display with a single bitstream and for realizing free-viewpoint television. Experiments show that the proposed scheme achieves about 7.4% bitrate reduction relative to the most popular multi-view video coding method and almost the same efficiency as the conventional scheme, which supports neither free-viewpoint scalability nor SNR scalability. Compared to one of the simplest free-viewpoint scalable multi-view coding schemes, up to 16.8% bitrate reduction is achieved by the proposed method.
    01/2008;
  • ABSTRACT: Neighboring views in multiview video systems are highly correlated, so neighboring views should be exploited to compress the videos efficiently. There are many approaches to doing this; however, most of them treat pictures of other views in the same way as pictures of the current view, i.e., pictures of other views are used as reference pictures (inter-view prediction). We introduce two approaches to improving compression efficiency in this paper. The first synthesizes pictures at a given time and position by view interpolation and uses them as reference pictures (view-interpolation prediction); in other words, geometry is compensated to obtain precise predictions. The second corrects the luminance and chrominance of other views by using lookup tables to compensate for photoelectric variations in individual cameras. We implemented these ideas in H.264/AVC with inter-view prediction and confirmed that they worked well. The experimental results revealed that these ideas can reduce the number of generated bits by approximately 15% without loss of PSNR. (A sketch of lookup-table luminance compensation follows the publication list.)
    IEEE Transactions on Circuits and Systems for Video Technology 12/2007; · 1.82 Impact Factor
  • ABSTRACT: A surface light field is a light field parameterized over the surface of a geometric model. Although this multiview object representation enables rendering of objects from arbitrary views, the data size of a surface light field is huge. In this paper, a compression scheme is proposed. Together with compression efficiency, scalability and fast, memory-efficient rendering are considered. To enable these functionalities, we base our compression scheme on a factorized representation of the surface light field and utilize a parameterization of the geometric model onto a base mesh, allowing wavelet transforms to be applied for progressive compression and scalable rendering. Experiments were conducted on real-world and synthetic data to study the compression efficiency and the characteristics of the proposed scheme.
    IEEE Transactions on Circuits and Systems for Video Technology 12/2007; · 1.82 Impact Factor
  • ABSTRACT: Multiview video coding demands high compression rates as well as view scalability, which enables the video to be displayed on a multitude of different terminals. In order to achieve view scalability, it is necessary to limit the inter-view prediction structure. In this paper, we propose a new multiview video coding scheme that improves compression efficiency under such a limited inter-view prediction structure. All views are divided into two groups in the proposed scheme: the base view and enhancement views. The proposed scheme first estimates a view-dependent geometry of the base view. It then uses a video encoder to encode the video of the base view; the view-dependent geometry is also encoded by the video encoder. The scheme then generates prediction images of the enhancement views from the decoded video and the view-dependent geometry using image-based rendering techniques, and it forms residual signals for each enhancement view. Finally, it encodes the residual signals with the conventional video encoder as if they were regular video signals. We implement one encoder that employs this scheme, using a depth map as the view-dependent geometry and 3-D warping as the view generation method. In order to increase the coding efficiency, we adopt three modifications: (1) object-based interpolation on 3-D warping; (2) depth estimation with consideration of rate-distortion costs; and (3) quarter-pel accuracy depth representation. Experiments show that the proposed scheme offers about 30% higher compression efficiency than the conventional scheme, even though one depth map video is added to the original multiview video. (A sketch of depth-based 3-D warping follows the publication list.)
    IEEE Transactions on Circuits and Systems for Video Technology 12/2007; · 1.82 Impact Factor
  • ABSTRACT: Weighted prediction (WP) is an efficient tool for encoding video scenes that contain brightness variations caused by fades. The Scalable Video Coding (SVC) extension of H.264/AVC can apply the WP tool of H.264/AVC to each spatial layer. However, because the weighted parameter sets in the WP of SVC are assigned per slice, coding efficiency is degraded if the brightness variation is non-uniform within a slice. We propose a new implicit-mode WP for enhancement layers that can assign weighted parameter sets to every macroblock or macroblock partition without adding bits; the sets are derived from the reconstructed signals of the subordinate layer. Experiments show that the proposed implicit-mode WP achieves a significant coding gain (up to 8.22%, 2.23% on average) on white/black fade-in/out scenes versus the conventional WP of the SVC reference encoder. (A sketch of deriving WP parameters from the subordinate layer follows the publication list.)
    01/2007;
  • ABSTRACT: Higher frame-rates are being considered to achieve more realistic representations. Since increasing the frame-rate increases the total amount of information, efficient coding methods are required; however, the statistical properties of such data have not been clarified. This paper establishes, for high frame-rate video, a mathematical model of the relationship between frame-rate and bit-rate. The model incorporates the effect of the low-pass filtering induced by the open shutter. By incorporating the shutter's open interval, the model can be extended to describe various frame-rate downsampling cases. A coding experiment confirms the validity of the mathematical model. (A sketch of shutter-aware downsampling follows the publication list.)
    Image Processing, 2007. ICIP 2007. IEEE International Conference on; 01/2007
  • ABSTRACT: Choosing the proper prediction mode is important in an H.264 encoder, since it offers many more modes than conventional methods such as MPEG-2. Typical H.264 encoders like JM and JSVM use squared error as the distortion criterion for mode decision. However, squared error does not always provide a correct measure of distortion from the viewpoint of subjective quality. In this paper, we investigate a mode decision method based on the human visual system in order to improve the H.264 encoder. We implement our method on the JSVM reference software and compare it with the original JSVM. Experiments show that our method achieves average bit-rate savings of 4.7 to 5.7% compared to the original JSVM, and we confirm that both methods yield reconstructed images with essentially the same subjective image quality.
    01/2007;
  • ABSTRACT: We have proposed free-viewpoint video communications, in which a viewer can change the viewpoint and viewing angle when receiving and watching video content. A free-viewpoint video consists of several views with different viewpoints. To change the viewpoint and viewing angle freely and instantaneously, a random-access capability that decodes the requested view with little delay is necessary. In this paper, a multiview video coding method that achieves high coding efficiency with low-delay random-access functionality is proposed. In the proposed method, the GOP is the basic unit of a view, and selective reference picture memory management is applied to multiple GOPs to improve coding efficiency. In addition, a coding method for disparity vectors that utilizes the camera arrangement is proposed. © 2007 Wiley Periodicals, Inc. Syst Comp Jpn, 38(5): 14-29, 2007; published online in Wiley InterScience. DOI 10.1002/scj.20683
    Systems and Computers in Japan 01/2007; 38:14-29.
  • ABSTRACT: Higher frame-rates are being considered to achieve more realistic representations. Since increasing the frame-rate increases the total amount of information, efficient coding methods are required; however, the statistical properties of such data have not been clarified. This paper establishes, for high frame-rate video, a mathematical model of the relationship between frame-rate and bit-rate. The model incorporates the effect of the integral phenomenon that occurs when the frame-rate is downsampled. A coding experiment confirms the validity of the mathematical model.
    Image Processing, 2006 IEEE International Conference on; 11/2006
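
The sketches below illustrate, under clearly labelled assumptions, a few of the techniques described in the abstracts above; they are minimal illustrations, not the authors' implementations. The first relates to the high frame-rate down-sampling work (ICIP 2009 / PCS 2009): instead of keeping every k-th frame, each group of k frames contributes the candidate closest to the previously kept frame, which tends to lower the inter-frame prediction error. The greedy selection and SAD criterion here are assumptions made for illustration.

```python
import numpy as np

def subsample_min_sad(frames, k):
    """Greedy sub-sampling: from each group of k high-rate frames, keep the
    frame with the smallest SAD to the previously kept frame.
    frames: list of HxW uint8 arrays captured at a high frame rate."""
    kept = [frames[0]]
    for start in range(k, len(frames), k):
        group = frames[start:start + k]
        sads = [np.abs(f.astype(np.int32) - kept[-1].astype(np.int32)).sum()
                for f in group]
        kept.append(group[int(np.argmin(sads))])
    return kept

# Usage: reducing a 1000 Hz capture toward roughly 60 Hz corresponds to k ≈ 16.
hi_rate = [np.random.randint(0, 256, (64, 64), dtype=np.uint8) for _ in range(160)]
lo_rate = subsample_min_sad(hi_rate, k=16)
```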
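
Next, an empirical companion to the KLT energy-compaction paper (ICIP 2009): it estimates the KLT from the sample covariance of length-n blocks of a first-order Gauss-Markov source and measures how much signal energy the leading coefficients capture as n grows. The source model, block sizes, and keep ratio are assumptions chosen for illustration; the paper itself derives an analytical model.

```python
import numpy as np

def klt_energy_compaction(signal, n, keep_ratio=0.25):
    """Fraction of total energy captured by the top keep_ratio*n KLT
    coefficients of length-n blocks taken from a 1-D signal."""
    blocks = signal[:len(signal) // n * n].reshape(-1, n)
    blocks = blocks - blocks.mean(axis=0)
    cov = np.cov(blocks, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)     # KLT basis = eigenvectors of cov
    coeffs = blocks @ eigvecs                  # transform coefficients
    energy = np.sort((coeffs ** 2).sum(axis=0))[::-1]
    m = max(1, int(keep_ratio * n))
    return energy[:m].sum() / energy.sum()

# First-order Gauss-Markov source (rho = 0.95), a common image-signal model.
rng = np.random.default_rng(0)
x = np.zeros(100_000)
for i in range(1, len(x)):
    x[i] = 0.95 * x[i - 1] + rng.standard_normal()
for n in (4, 8, 16, 32):
    print(n, klt_energy_compaction(x, n))   # compaction grows with dimensionality
```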
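
For the perceptual mode-decision papers (ICIP 2008 and the related 2007 entry), the sketch below shows the general shape of a mode decision that replaces plain squared error with a perceptually weighted distortion in the rate-distortion cost J = D + λR. The weight map and λ are placeholders, not the spatio-temporal contrast sensitivity model used in the papers.

```python
import numpy as np

def rd_cost(orig, recon, bits, weight, lam):
    """Perceptually weighted SSD plus rate term: J = sum(w * (x - x_hat)^2) + lam * bits."""
    diff = orig.astype(np.float64) - recon.astype(np.float64)
    return float((weight * diff ** 2).sum() + lam * bits)

def pick_mode(orig, candidates, weight, lam):
    """candidates: list of (mode_name, reconstructed_block, estimated_bits)."""
    return min(candidates, key=lambda c: rd_cost(orig, c[1], c[2], weight, lam))[0]

# Placeholder usage: a flat weight map reduces to the usual SSD-based decision;
# lowering weights where the eye is less sensitive biases the choice there.
orig = np.full((16, 16), 128, dtype=np.uint8)
cand = [("intra16", orig + 2, 40), ("intra4", orig + 1, 120)]
print(pick_mode(orig, cand, weight=np.ones((16, 16)), lam=10.0))
```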
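
For the TCSVT 2007 multiview paper that corrects the luminance and chrominance of other views with lookup tables, the sketch below builds a per-component 8-bit table by histogram matching between two camera views; the table construction used in the paper may differ.

```python
import numpy as np

def build_lut(src, ref):
    """Map 8-bit values of view `src` so its histogram matches view `ref`."""
    src_hist = np.bincount(src.ravel(), minlength=256).astype(np.float64)
    ref_hist = np.bincount(ref.ravel(), minlength=256).astype(np.float64)
    src_cdf = np.cumsum(src_hist) / src_hist.sum()
    ref_cdf = np.cumsum(ref_hist) / ref_hist.sum()
    # For each source level, pick the reference level whose CDF is closest.
    return np.searchsorted(ref_cdf, src_cdf).clip(0, 255).astype(np.uint8)

def compensate(src, lut):
    return lut[src]   # apply before using `src` as an inter-view reference picture
```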
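
For the TCSVT 2007 depth-map/3-D-warping paper, the sketch below performs simple forward warping for rectified, parallel cameras: each base-view pixel is shifted by a disparity derived from its depth, with a z-buffer to resolve occlusions. The rectified-camera assumption, integer-pel shifts, and the absence of hole filling are simplifications; the paper additionally uses object-based interpolation and quarter-pel depth.

```python
import numpy as np

def warp_view(base, depth, focal, baseline):
    """base: HxW luma plane; depth: HxW strictly positive depth values in the
    same units as `baseline`; focal and baseline are illustrative parameters."""
    h, w = base.shape
    synth = np.zeros_like(base)
    zbuf = np.full((h, w), np.inf)
    disparity = np.round(focal * baseline / depth).astype(np.int64)
    for y in range(h):
        for x in range(w):
            xs = x - disparity[y, x]            # shift toward the target camera
            if 0 <= xs < w and depth[y, x] < zbuf[y, xs]:
                zbuf[y, xs] = depth[y, x]       # nearer pixels win (occlusion)
                synth[y, xs] = base[y, x]
    return synth                                # used as a prediction reference
```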
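
For the implicit weighted-prediction paper (2007), the sketch below shows one way a per-macroblock weight and offset could be derived without sending bits: fit (w, o) by least squares between the collocated subordinate-layer reconstruction and its reference, then apply the pair to the enhancement-layer reference. The block granularity and the least-squares fit are assumptions, not necessarily the paper's derivation.

```python
import numpy as np

def derive_wp(base_cur_blk, base_ref_blk):
    """Least-squares fit base_cur ≈ w * base_ref + o, using only signals the
    decoder also has, so no parameters need to be transmitted."""
    x = base_ref_blk.astype(np.float64).ravel()
    y = base_cur_blk.astype(np.float64).ravel()
    a = np.stack([x, np.ones_like(x)], axis=1)
    (w, o), *_ = np.linalg.lstsq(a, y, rcond=None)
    return w, o

def weighted_pred(enh_ref_blk, w, o):
    """Apply the derived parameters to the enhancement-layer reference block."""
    return np.clip(w * enh_ref_blk.astype(np.float64) + o, 0, 255)
```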
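
Finally, for the frame-rate/bit-rate modelling papers (ICIP 2006/2007), the sketch below contrasts the two downsampling cases the model distinguishes: picking every k-th frame (fully closed shutter) versus averaging each group of k frames, which emulates the temporal low-pass filtering of an open shutter. This only illustrates the signal generation, not the bit-rate model itself.

```python
import numpy as np

def downsample_open_shutter(frames, k):
    """Average each group of k frames: temporal integration, i.e. open shutter."""
    frames = np.asarray(frames, dtype=np.float64)
    n = len(frames) // k * k
    groups = frames[:n].reshape(-1, k, *frames.shape[1:])
    return groups.mean(axis=1)

def downsample_closed_shutter(frames, k):
    """Keep every k-th frame with no integration."""
    return np.asarray(frames)[::k]
```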