Bo Tao

Los Gatos Research, Mountain View, California, United States

Publications (21) · 14.67 Total impact

  • B. Tao, M.T. Orchard
    ABSTRACT: We find the optimal window for overlapped block motion compensation (OBMC) by statistically modeling the motion field, the field of block motion estimates, and their relationship. This enables us to show how the optimal OBMC window is affected by random field parameters, such as the variance of the motion field and the correlation coefficients of both the intensity field and the motion field. The OBMC window obtained in this fashion is shown to have good performance in reducing the prediction error. Furthermore, this parametric solution provides insight into motion uncertainty and the overlapped motion compensation process.
    IEEE Transactions on Image Processing 04/2001; · 3.20 Impact Factor
  • Bo Tao, M.T. Orchard
    ABSTRACT: This paper analyzes the relationship between the residual frame and the previous frame in motion-compensated video coding. It is found that the variance of the residual signal depends on the gradient magnitude. On average, the variance of the residual signal is larger for pixels with larger gradient magnitude. Two applications of this analysis are presented. In the first one, the relationship between the residual signal variance and the gradient magnitude is used to model the second-order statistics of the residual field in a nonstationary way. This modeling enables more efficient residual signal coding. The other application is for pixel decimation-based fast block matching. It is proposed that pixels with the largest gradient magnitude in a block be chosen to participate in the block matching process. It is demonstrated that such gradient-adaptive subsampling achieves a great advantage over two other known subsampling methods.
    IEEE Transactions on Image Processing 02/2001; · 3.20 Impact Factor
  • Bo Tao, M.T. Orchard
    ABSTRACT: In a standard hybrid video coder, there are two important factors affecting motion-compensated prediction: motion uncertainty and quantization noise. Motion uncertainty results from the use of block-decimated motion compensation, which cannot specify the motion for each pixel. The propagation of quantization noise results from interframe prediction. We analyze both of these effects and their block-structured nonstationary properties. We propose a block-adaptive linear filtering framework to reduce motion uncertainty and quantization noise propagation simultaneously. This new motion-compensated predictor can be viewed as a joint application of overlapped block motion compensation and loop filtering (LF). Several system configurations are evaluated. We show that this linear filtering framework achieves better rate-distortion performance than the single use of either overlapped block motion compensation or LF.
    IEEE Transactions on Circuits and Systems for Video Technology 02/2001; · 1.82 Impact Factor
  • ABSTRACT: We present an adaptive model-driven bit-allocation algorithm for video sequence coding. The algorithm is based on a parametric rate-distortion model, and facilitates both picture- and macroblock-level bit allocation. A region classification scheme is incorporated into the algorithm, which exploits characteristics of human visual perception to efficiently allocate bits according to a region's visual importance. The application of this algorithm to MPEG video coding is discussed in detail. We show that the proposed algorithm is computationally efficient and has many advantages over the MPEG-2 TM5 bit-allocation algorithm.
    IEEE Transactions on Circuits and Systems for Video Technology 03/2000; · 1.82 Impact Factor
  • ABSTRACT: Our starting point is gradient indexing, the characterization of texture by a feature vector that comprises a histogram derived from the image gradient field. We investigate the use of gradient indexing for texture recognition and image retrieval. We find that gradient indexing is a robust measure with respect to the number of bins and to the choice of the gradient operator. We also find that the gradient direction and magnitude are equally effective in recognizing different textures. Furthermore, a variant of gradient indexing called local activity spectrum is proposed and shown to have improved performance. Local activity spectrum is employed in an image retrieval system as the texture statistic. The retrieval system is based on a segmentation technique employing a distance measure called Sum of Minimum Distance. This system enables content-based retrieval of database images from templates of arbitrary size.
    Journal of Visual Communication and Image Representation. 01/2000;
  • IEEE Trans. Circuits Syst. Video Techn. 01/2000; 10:147-157.
  • Xin Li, Bo Tao, M.T. Orchard
    ABSTRACT: Arbitrary linear transforms are often applied to integer-valued data sequences, producing non-integer outputs. In such cases, it is often desirable to approximate the true transform with a non-linear transform producing integer outputs as close as possible to those of the true transform, while retaining the reversibility property of a transform. This paper proposes a new approach to implement transforms that map integers to integers in a reversible way. The approach aims to develop a reversible implementation that minimizes the average distortion between the integer outputs and the outputs of the true linear transform.
    Image Processing, 1998. ICIP 98. Proceedings. 1998 International Conference on; 11/1998
  • Bo Tao, M.T. Orchard
    ABSTRACT: Motion-compensated prediction is widely used in video coding. In current coders, it only predicts the first-order statistics (mean value). However, we show in this paper that the variance of each individual pixel in the residual field can be predicted too, by relating it to the gradient magnitude computed from the previous frame. This relationship leads to designing the residual field coder such that optimal Karhunen-Loeve transforms can be computed for each block. Furthermore, no matter what transform is used, the variance of each individual transform coefficient is known to both ends of the coding process, and hence conditional entropy coding can be implemented.
    Image Processing, 1998. ICIP 98. Proceedings. 1998 International Conference on; 11/1998
  • Bo Tao, B.W. Dickinson
    ABSTRACT: We investigate the use of gradient indexing for texture discrimination. From a histogram of its gradient field, a texture can be recognized. We find that gradient indexing is insensitive to the number of bins and to the choice of the gradient operator. We also find that the gradient direction and magnitude are both important in recognizing different textures. We further propose a modified gradient indexing technique called local activity spectrum, which has improved performance.
    Image Processing, 1998. ICIP 98. Proceedings. 1998 International Conference on; 11/1998
  • ABSTRACT: An adaptive watermarking technique is introduced in this work. A regional perceptual classifier is employed to assign a noise sensitivity index to each region. The watermark is inserted in the original image according to this index by using block DCT. The detection of the watermark is designed to achieve a desired false alarm probability.
    12/1997;
  • Bo Tao, M.T. Orchard
    ABSTRACT: This paper presents an analysis of the block-decimated motion estimates and relates them to the underlying motion random field. It further parameterizes the scene intensity random field and the motion random field in terms of their correlation properties. Within this framework, we develop an algorithm to optimize the window for overlapped block motion compensation as a function of the model parameters. Through simulations, we demonstrate that the optimal window resulting from the parametric formulation offers performance comparable to the window deterministically optimized for the test sequence, and it offers more robust performance outside the training set. Finally, we apply our algorithm to adapt the overlapped window to match the temporally changing characteristics of the scene and motion fields. We demonstrate that for real-time applications, where the number of frames used for adapting the window is limited, our algorithm significantly outperforms the method introduced by Orchard and Sullivan (see IEEE Trans. Image Processing, vol.3, no.5, p.693-9, 1994).
    Signals, Systems & Computers, 1997. Conference Record of the Thirty-First Asilomar Conference on; 12/1997
  • ABSTRACT: Motion uncertainty and image quantization noise are two important factors affecting the performance of block motion compensation in a standard hybrid video coder. This paper models both of these effects and analyzes their nonstationary properties. It proposes a strategy that jointly applies space-varying overlapped block motion compensation and loop filtering, achieving better rate-distortion performance than either method alone.
    Image Processing, 1997. Proceedings., International Conference on; 11/1997
  • ABSTRACT: We derive here a parametric rate-quantization model, based on traditional rate-distortion theory, for MPEG encoders. Given the bit budget for a picture, our model calculates a baseline quantization scale factor. We compare our approach to a technique using an ad hoc rate-quantization model, and also to TM5. In both cases, experimental results demonstrate that our approach produces an actual encoded bitrate closer to the target bit budget for the picture, as well as an improved peak signal-to-noise ratio.
    Image Processing, 1997. Proceedings., International Conference on; 11/1997
  • B. Tao, B. Dickinson
    ABSTRACT: An adaptive watermarking technique is introduced in this work. A regional perceptual classifier is employed to assign a noise sensitivity index to each region. The watermark is inserted in the original image according to this index by using block DCT. The detection of the watermark is designed to achieve a desired false alarm probability.
    Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on; 05/1997
  • ABSTRACT: We study the problem of retrieving images using a small template. The goal is to allow a user to search for images containing a pattern similar to the template, adding to the capability of a search engine. We propose to employ a segmentation-based approach. As a specific example, we introduce a quadtree segmentation technique for textured images and a distance measure, Sum of Minimum Distance, suitable for template-based image retrieval applications.
    01/1997;
  • ABSTRACT: We introduce two texture classification techniques applicable to images compressed using block DCT. The first technique is a parametric approach. It models a texture as a stationary Gaussian process and utilizes the diagonalizing property of DCT. The second one uses the concept of power spectrum in the DCT domain. The energy distribution is employed to discriminate different textures. Both techniques work on compressed data without decoding and are designed to be robust against quantization noise.
    Electronic Imaging '97; 01/1997
  • ABSTRACT: We study the problem of retrieving images using a small template. The goal is to allow a user to search for images containing a pattern similar to the template, adding to the capability of a search engine. We propose to employ a segmentation-based approach. As a specific example, we introduce a quadtree segmentation technique for textured images and a distance measure, Sum of Minimum Distance, suitable for template-based image retrieval applications.
    01/1997
  • ABSTRACT: We introduce two texture classification techniques applicable to images compressed using block DCT. The first technique is a parametric approach. It models a texture as a stationary Gaussian process and utilizes the diagonalizing property of DCT. The second one uses the concept of power spectrum in the DCT domain. The energy distribution is employed to discriminate different textures. Both techniques work on compressed data without decoding and are designed to be robust against quantization noise.
    Proc SPIE 01/1997;
  • ABSTRACT: Video databases can be searched for visual content by searching over automatically extracted key frames rather than the complete video sequence. Many video materials used in the humanities and social sciences contain a preponderance of shots of people. In this paper, we describe our work in semantic image retrieval of person-rich scenes (key frames) for video databases and libraries. We use an approach called retrieval through segmentation. A key-frame image is first segmented into human subjects and background. We developed a specialized segmentation technique that utilizes both human flesh-tone detection and contour analysis. Experimental results show that this technique can effectively segment images with low time complexity. Once the image has been segmented, we can then extract features or pose queries about both the people and the background. We propose a retrieval framework that is based on the segmentation results and the extracted features of people and background.
    Proc SPIE 12/1996;
  • ABSTRACT: In this work, we study the relationship between content-based image retrieval and pattern recognition by modeling the image retrieval process probabilistically. A model called random image database will be presented, together with a retrieval quality measure called probability of self similar, which enables us to establish the link between image retrieval and pattern recognition. The main result is that such a quality measure is uniformly upper-bounded by its pattern recognition counterpart using the nearest neighbor rule, when only one training sample is available for each class. Therefore, a feature measure that performs better in the one-training-sample-per-class case should be favored over features that do well in large-training-sample situations.
    Proc SPIE 11/1996;