R. Ward

University of British Columbia - Vancouver, Vancouver, British Columbia, Canada

Publications (50) · Total impact: 39.82

  • ABSTRACT: For down-sizing MPEG-2 to H.264/AVC transcoding, an efficient algorithm for estimating the initial H.264/AVC motion vectors is proposed. By using the estimated initial motion vectors, only a small range of motion-vector refinement is sufficient to find the final motion vector for each partition. Experimental results show that our proposed algorithm achieves an average 0.08 dB improvement (maximum 0.24 dB) in picture quality compared to the other state-of-the-art method. At the same time, the computational complexity of estimating the initial motion vectors is lower than that of the other state-of-the-art technique (an average 36% reduction).
    Image Processing (ICIP), 2009 16th IEEE International Conference on; 12/2009
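    Illustrative sketch (not the paper's algorithm): reuse a down-scaled MPEG-2 motion vector as the initial H.264 vector, then refine it within a small SAD search window. The block size, window and synthetic data below are assumptions.

        import numpy as np

        def sad(block, candidate):
            """Sum of absolute differences between two equally sized blocks."""
            return np.abs(block.astype(np.int32) - candidate.astype(np.int32)).sum()

        def refine_motion_vector(cur, ref, top_left, block_size, mv_init, search_range=2):
            """Refine an initial motion vector within a +/- search_range window."""
            y0, x0 = top_left
            block = cur[y0:y0 + block_size, x0:x0 + block_size]
            best_mv, best_cost = mv_init, np.inf
            for dy in range(-search_range, search_range + 1):
                for dx in range(-search_range, search_range + 1):
                    ry, rx = y0 + mv_init[0] + dy, x0 + mv_init[1] + dx
                    if 0 <= ry and 0 <= rx and ry + block_size <= ref.shape[0] and rx + block_size <= ref.shape[1]:
                        cost = sad(block, ref[ry:ry + block_size, rx:rx + block_size])
                        if cost < best_cost:
                            best_cost, best_mv = cost, (mv_init[0] + dy, mv_init[1] + dx)
            return best_mv

        # Example: a 2:1 spatial down-sizing halves the incoming MPEG-2 vector before refinement.
        rng = np.random.default_rng(0)
        ref = rng.integers(0, 256, (64, 64), dtype=np.uint8)
        cur = np.roll(ref, shift=(3, -2), axis=(0, 1))      # synthetic motion
        mpeg2_mv = (-6, 4)                                   # vector at the original resolution
        init_mv = (mpeg2_mv[0] // 2, mpeg2_mv[1] // 2)       # scaled for the down-sized frame
        print(refine_motion_vector(cur, ref, (16, 16), 8, init_mv))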
  • ABSTRACT: Angle Quantization Index Modulation (AQIM) is a watermarking technique that achieves provably good capacity-distortion-robustness performance. The scheme embeds each bit of a watermark message in an image by quantizing the angle formed by a vector of pixels or frequency coefficients from the host image with the origin of a hyperspherical coordinate system. While it has been demonstrated that AQIM is insensitive to intensity-scaling attacks, the quantization step employed throughout the embedding process is set to a constant value regardless of the nature of the image. Because of this, the perceptual quality of the watermarked image cannot be regulated. In this paper, we propose a straightforward yet powerful AQIM-based watermarking scheme that considers the statistical behavior of the region where a message bit is to be embedded before settling on the size of the quantization step. Experimental results show that, for the same watermark capacity and image fidelity, our proposed region-specific scheme exhibits a lower bit error rate than the regular AQIM approach.
    Broadband Multimedia Systems and Broadcasting, 2009. BMSB '09. IEEE International Symposium on; 06/2009
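    A minimal 2-D sketch of the angle-quantization idea behind AQIM, using a fixed quantization step (the paper's region-adaptive step is not reproduced here); the step size and test vector are illustrative.

        import numpy as np

        def embed_bit(v, bit, delta):
            """Quantize the angle of 2-D vector v onto the lattice assigned to `bit`."""
            r, theta = np.hypot(*v), np.arctan2(v[1], v[0])
            dither = 0.0 if bit == 0 else delta / 2.0
            theta_q = np.round((theta - dither) / delta) * delta + dither
            return np.array([r * np.cos(theta_q), r * np.sin(theta_q)])

        def detect_bit(v, delta):
            """Decide which lattice (bit 0 or 1) the received angle is closer to."""
            theta = np.arctan2(v[1], v[0])
            d0 = np.abs(theta - np.round(theta / delta) * delta)
            d1 = np.abs(theta - (np.round((theta - delta / 2) / delta) * delta + delta / 2))
            return 0 if d0 <= d1 else 1

        v = np.array([3.0, 1.0])
        for bit in (0, 1):
            w = embed_bit(v, bit, delta=np.pi / 16)
            w_scaled = 2.5 * w               # intensity scaling leaves the angle unchanged
            print(bit, detect_bit(w_scaled, np.pi / 16))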
  • Shan Du, R. Ward
    ABSTRACT: The pose variation in facial images significantly degrades the performance of face recognition systems. In this paper, a component-wise pose-normalization method for facilitating pose-invariant face recognition is proposed. The main idea is to normalize a non-frontal facial image to a virtual frontal image component by component. In this method, we first partition the non-frontal facial image into different facial components and then estimate the virtual frontal view for each component separately. The final virtual frontal image is generated by integrating the virtual frontal components. The proposed method relies only on 2D images, so complex 3D modeling is not needed. Experimental results on the CMU-PIE database demonstrate the advantages of the proposed method.
    Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on; 05/2009
  • ABSTRACT: We propose a wavelet-based intra-prediction image coding scheme that is very efficient for lossless compression. The scheme combines JPEG-2000's wavelet decomposition with H.264's intra-prediction capability to compress still images. The compression performance is analyzed, and the results demonstrate that our method outperforms H.264 intra-mode coding by about 7% and JPEG-2000 by around 15% in the lossless case.
    Consumer Electronics, 2009. ICCE '09. Digest of Technical Papers International Conference on; 02/2009
  • ABSTRACT: Multimedia applications such as video streaming and mobile TV are emerging as the most promising applications over wireless networks. The increased coding efficiency and network-friendly architecture of the latest video coding standard, H.264/AVC, have facilitated the delivery of coded video content to wireless users. However, wireless networks allow lower transmission bit-rates than wired networks, while the display resolution of mobile devices is generally smaller than that of standard-definition (SD) TV. This calls for fast bit-rate reduction through video transcoding that delivers the best video quality to the mobile receiver while adhering to the bit-rate constraints of the wireless network. In this paper, we present a bit-rate estimation model that speeds up the transcoding process by predicting the transcoded video bit-rate for different spatial resolution reduction ratios and quantization steps. We demonstrate that, on average, our proposed model estimates the bit-rate of the transcoded video to within 5% of its actual value.
    Wireless Pervasive Computing, 2008. ISWPC 2008. 3rd International Symposium on; 06/2008
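    A hypothetical sketch of such a predictor: a log-linear model R_out ≈ R_in · s^a · (q_in/q_out)^b fitted to a few measured operating points. The model form, exponents and data are illustrative assumptions, not the paper's model.

        import numpy as np

        # Hypothetical operating points (illustrative numbers only):
        # (input rate kbps, spatial scaling ratio s, input Qstep, output Qstep, measured output rate kbps)
        samples = np.array([
            [2000.0, 0.50, 10.0, 20.0, 420.0],
            [2000.0, 0.50, 10.0, 30.0, 300.0],
            [2000.0, 0.25, 10.0, 20.0, 160.0],
            [4000.0, 0.50, 12.0, 24.0, 820.0],
        ])
        r_in, s, q_in, q_out, r_out = samples.T

        # Least-squares fit of log(R_out / R_in) = a*log(s) + b*log(q_in / q_out)
        A = np.column_stack([np.log(s), np.log(q_in / q_out)])
        y = np.log(r_out / r_in)
        (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)

        def predict_rate(r_in, s, q_in, q_out):
            """Predict the transcoded bit-rate from the fitted log-linear model."""
            return r_in * s ** a * (q_in / q_out) ** b

        print(f"a={a:.2f}, b={b:.2f}, predicted={predict_rate(2000, 0.5, 10, 25):.0f} kbps")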
  • ABSTRACT: The demand for interactive TV services is rapidly increasing. Most interactive TV systems are based on the DVB-MHP (digital video broadcast - multimedia home platform) format. Given the unprecedented success of DVD technology and its already established installed base, it is clearly in the interest of manufacturers and end-users to be able to play back DVB-MHP iTV content on a Blu-ray player. This study addresses how such programs could be played by the Blu-ray system in real time. One of the main challenges in realizing this compatibility is the conversion of the "system information" data, which carry the information about the broadcast programs and services. This paper first analyzes the differences in system information between the two standards, DVB-MHP and Blu-ray: mainly the information each standard requires, where this information is stored, and how it is organized. We then propose methods to transcode this information in real time. To derive the data required to generate the Blu-ray system information from a transport stream, we propose a data retrieval scheme that avoids fully demultiplexing the transport stream. To extract the needed information from the DVB-MHP video stream, we propose an efficient search algorithm. Finally, an improved structure that transcodes the incoming system information is developed. This transcoder processes only the updated versions of the system information while keeping the rate at which this information is transmitted to the Blu-ray system the same as in DVB-MHP.
    IEEE Transactions on Consumer Electronics, 06/2008 · 1.09 Impact Factor
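    A minimal sketch of the kind of selective retrieval described above: transport-stream packets are filtered by their PID so that system-information sections can be collected without demultiplexing the whole stream. The example PID (0x0000, the PAT) and the file name are assumptions.

        TS_PACKET_SIZE = 188
        SYNC_BYTE = 0x47

        def payloads_for_pid(ts_bytes, target_pid):
            """Yield the payload of every transport packet that carries `target_pid`."""
            for off in range(0, len(ts_bytes) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
                pkt = ts_bytes[off:off + TS_PACKET_SIZE]
                if pkt[0] != SYNC_BYTE:
                    continue                          # lost sync; a real parser would resynchronize
                pid = ((pkt[1] & 0x1F) << 8) | pkt[2]
                if pid != target_pid:
                    continue
                adaptation = (pkt[3] >> 4) & 0x3      # adaptation_field_control
                start = 4
                if adaptation in (2, 3):              # adaptation field present
                    start += 1 + pkt[4]
                if adaptation in (1, 3) and start < TS_PACKET_SIZE:
                    yield pkt[start:]

        # Usage (hypothetical file name):
        # with open("broadcast.ts", "rb") as f:
        #     for payload in payloads_for_pid(f.read(), target_pid=0x0000):
        #         ...  # reassemble PAT sections, then follow the PMT PIDs they announce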
  • ABSTRACT: One objective in MPEG-2 to H.264 transcoding is to improve the H.264 compression ratio by using more accurate H.264 motion vectors. Motion re-estimation is by far the most time-consuming process in video transcoding, and improving the search speed is a challenging problem. We introduce a new transcoding scheme that uses the MPEG-2 DCT coefficients to predict the block-size partitioning for H.264. Performance evaluations show that, for the same rate-distortion performance, our proposed scheme reduces computational complexity by more than 82% compared to the full-range motion estimation used by H.264.
    Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on; 05/2008
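    An illustrative heuristic (not the paper's exact mapping) of how MPEG-2 DCT energy can hint at an H.264 partition size: blocks whose AC energy concentrates at high frequencies are given finer partitions. The thresholds are assumptions.

        import numpy as np

        def suggest_partition(dct8x8, threshold=0.15):
            """Return a coarse partition hint from the AC energy distribution of an 8x8 DCT block."""
            energy = dct8x8.astype(np.float64) ** 2
            total_ac = energy.sum() - energy[0, 0]
            if total_ac == 0:
                return "16x16"                       # flat block: a large partition is enough
            rows, cols = np.indices(dct8x8.shape)
            high = energy[(rows + cols) >= 6].sum()  # energy far from the DC corner
            ratio = high / total_ac
            if ratio < threshold:
                return "16x16"                       # smooth, coherent area
            elif ratio < 3 * threshold:
                return "8x8"
            return "4x4"                             # detailed area: finer partitions likely pay off

        rng = np.random.default_rng(1)
        smooth = np.zeros((8, 8)); smooth[0, 0] = 500; smooth[0, 1] = 40
        busy = rng.normal(0, 30, (8, 8))
        print(suggest_partition(smooth), suggest_partition(busy))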
  • ABSTRACT: A new video watermarking algorithm for access control is introduced. The method is content-dependent and uses the dual-tree complex wavelet transform (DT CWT) to create a watermark that is robust to geometric distortions and lossy compression. The watermark is a random array of 1's and -1's. A one-level DT CWT is applied to this watermark, and the coefficients of this transformation are embedded into selected frequency components of the video sequence. The robustness of the method is tested against a joint attack involving rotation, scaling, cropping and H.264 video compression.
    Digital Information Management, 2007. ICDIM '07. 2nd International Conference on; 11/2007
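    A sketch of the embedding idea, with a plain DWT from PyWavelets standing in for the dual-tree complex wavelet transform used in the paper; the subband choice and embedding strength are assumptions.

        import numpy as np
        import pywt

        def embed_watermark(frame, key=42, strength=2.0):
            """Additively embed a transformed random +/-1 watermark into one detail subband."""
            rng = np.random.default_rng(key)
            cA, (cH, cV, cD) = pywt.dwt2(frame.astype(np.float64), "haar")
            wm = rng.choice([-1.0, 1.0], size=cH.shape)       # random +/-1 watermark array
            wA, (wH, wV, wD) = pywt.dwt2(wm, "haar")          # one-level transform of the watermark
            # Place the transformed watermark onto the horizontal detail subband and add it.
            wh = np.zeros_like(cH)
            wh[:wH.shape[0], :wH.shape[1]] = wH
            cH_marked = cH + strength * wh
            return pywt.idwt2((cA, (cH_marked, cV, cD)), "haar"), wm

        frame = np.tile(np.linspace(0, 255, 64), (64, 1))     # synthetic 64x64 frame
        marked, wm = embed_watermark(frame)
        print(np.abs(marked - frame).mean())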
  • ABSTRACT: Although open-loop transcoding is known as the most computationally efficient transcoding structure, it is also known to introduce many distortions in the transcoded video. This paper addresses the chrominance distortions resulting from the open-loop MPEG-2 to H.264 transcoding structure and proposes algorithms to compensate for them. The open-loop structure is replaced by a closed-loop transcoding structure, which provides high-quality video by removing the chrominance distortions, resulting in an average picture quality improvement of 6 dB.
    Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on; 05/2007
  • Shan Du, R. Ward
    ABSTRACT: Illumination variation is a main obstacle in facial feature detection. This paper presents a novel automated approach that localizes eyes in gray-scale face images and is robust to illumination changes. The approach does not require prior knowledge of face orientation or illumination strength, and no initialization or training process is needed. Based on an edge map obtained via a multi-resolution wavelet transform, the approach first segments an image into different inhomogeneously illuminated regions. The illumination of every region is then adjusted so that the features' details are more pronounced. To locate the different facial features, for every region, a Gabor-based image is constructed from the re-lit image. The eye sub-regions are then identified using the edge map of the re-lit image. The method has been applied successfully to images of the Yale B face database taken under different illuminations.
    Image Processing, 2007. ICIP 2007. IEEE International Conference on; 01/2007
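    A small sketch of the Gabor filtering step: build a few oriented Gabor kernels and accumulate their responses over a (re-lit) region. The kernel parameters are illustrative, not those used in the paper.

        import numpy as np
        from scipy.signal import fftconvolve

        def gabor_kernel(size=21, sigma=4.0, theta=0.0, wavelength=8.0):
            """Real part of a Gabor kernel with orientation `theta`."""
            half = size // 2
            y, x = np.mgrid[-half:half + 1, -half:half + 1]
            xr = x * np.cos(theta) + y * np.sin(theta)
            yr = -x * np.sin(theta) + y * np.cos(theta)
            envelope = np.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
            carrier = np.cos(2 * np.pi * xr / wavelength)
            return envelope * carrier

        def gabor_response(image, orientations=4):
            """Sum of absolute filter responses over a few orientations."""
            response = np.zeros_like(image, dtype=np.float64)
            for k in range(orientations):
                kern = gabor_kernel(theta=k * np.pi / orientations)
                response += np.abs(fftconvolve(image, kern, mode="same"))
            return response

        rng = np.random.default_rng(3)
        region = rng.random((64, 64))
        print(gabor_response(region).shape)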
  • Shan Du, R. Ward
    ABSTRACT: Illumination changes in face images are a main obstacle for face recognition systems. To deal with this problem, this study presents a novel adaptive region-based image preprocessing scheme that enhances face images and facilitates the face recognition task. The method enhances both the edges and the contrast of face images regionally so as to alleviate side-lighting effects. Compared with the conventional global histogram equalization method, our method is shown to be more suitable for dealing with uneven illumination in face images. The method is evaluated on the Yale B face database, and the experimental results show the advantages of the proposed method, with an improvement of 16.1% on average over the histogram equalization method.
    Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on; 06/2006
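    A minimal sketch of region-wise (tiled) histogram equalization as a contrast to the global method; the fixed tiling below is an assumption, not the adaptive regions proposed in the paper.

        import numpy as np

        def equalize(tile):
            """Ordinary histogram equalization of one 8-bit tile."""
            hist = np.bincount(tile.ravel(), minlength=256)
            cdf = hist.cumsum()
            cdf = (cdf - cdf.min()) / max(cdf.max() - cdf.min(), 1)
            return (cdf[tile] * 255).astype(np.uint8)

        def regional_equalize(image, grid=(4, 4)):
            """Equalize each tile of a fixed grid independently."""
            out = np.empty_like(image)
            hs, ws = image.shape[0] // grid[0], image.shape[1] // grid[1]
            for i in range(grid[0]):
                for j in range(grid[1]):
                    ys, xs = slice(i * hs, (i + 1) * hs), slice(j * ws, (j + 1) * ws)
                    out[ys, xs] = equalize(image[ys, xs])
            return out

        rng = np.random.default_rng(5)
        # Synthetic unevenly lit face: brightness grows from left to right.
        face = (rng.random((128, 128)) * np.linspace(0.2, 1.0, 128)[None, :] * 255).astype(np.uint8)
        print(regional_equalize(face).std() > face.std())   # regional equalization spreads the histogram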
  • ABSTRACT: H.264, with its superior coding efficiency and network-friendly design, has emerged as the newest international video standard and is expected to become the preferred codec for video broadcasting. Presently, there is an effort to add scalability to H.264 in order to offer a solution to network congestion and bandwidth variations. Proposals range from a scalable subband extension to the introduction of FGS into the H.264 standard. We propose a novel structure for the FGS layer that uses the 4×4 integer transform, instead of the 8×8 discrete cosine transform (DCT), so that the same transform is used for both layers. We also propose a novel hierarchical algorithm for coding the macroblock header of the FGS layer that uses fewer bits than the standard FGS algorithm and significantly increases the coding efficiency. Compared with the original FGS structure, the proposed structure uses 70% fewer bits for macroblock headers, has lower complexity, and increases PSNR by 0.7 dB on average.
    Signal Processing Systems Design and Implementation, 2005. IEEE Workshop on; 12/2005
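    For reference, the standard H.264 4×4 forward integer transform (scaling is folded into quantization) that the proposed FGS layer reuses in place of the 8×8 DCT:

        import numpy as np

        # H.264 forward core transform matrix Cf.
        CF = np.array([[1,  1,  1,  1],
                       [2,  1, -1, -2],
                       [1, -1, -1,  1],
                       [1, -2,  2, -1]])

        def forward_4x4(block):
            """Y = Cf * X * Cf^T on a 4x4 residual block (integer arithmetic only)."""
            return CF @ block @ CF.T

        residual = np.arange(16).reshape(4, 4) - 8
        print(forward_4x4(residual))

    The corresponding inverse transform uses a different matrix (with ±1/2 entries) and absorbs the rescaling into dequantization.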
  • ABSTRACT: The blurring effect is among the major visual degradations resulting from digital image enlargement. We introduce a set of sigmoidal functions to sharpen the edges of enlarged images and show that these functions have properties that are especially desirable for sharpening expanded edges. We then develop an adaptive contrast enhancement scheme based on edge sharpening using these functions. It is shown that this scheme produces images with better contrast and natural-looking sharp edges.
    Image Processing, 2005. ICIP 2005. IEEE International Conference on; 10/2005
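    An illustrative sketch of sigmoidal edge sharpening: the deviation of each pixel from a local mean is passed through a tanh curve that amplifies small deviations and saturates large ones. The gain, limit and window are assumptions, not the specific sigmoidal family proposed in the paper.

        import numpy as np
        from scipy.ndimage import uniform_filter, zoom

        def sigmoid_sharpen(image, gain=2.5, limit=60.0, window=5):
            """Amplify small local deviations and saturate large ones around the local mean."""
            img = image.astype(np.float64)
            local_mean = uniform_filter(img, size=window)
            deviation = img - local_mean
            sharpened = local_mean + limit * np.tanh(gain * deviation / limit)
            return np.clip(sharpened, 0, 255).astype(np.uint8)

        rng = np.random.default_rng(7)
        small = (rng.random((32, 32)) * 255).astype(np.uint8)
        enlarged = np.clip(zoom(small.astype(np.float64), 4, order=1), 0, 255)  # bilinear up-scaling blurs edges
        print(sigmoid_sharpen(enlarged).shape)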
  • ABSTRACT: A watermarking algorithm that relies on informed coding and informed embedding is presented. The method embeds one bit of the watermark in every image block, but offers higher watermarked-image quality as well as higher robustness to image processing attacks than other known methods that use informed coding and informed embedding. The method is shown to withstand high levels of added Gaussian noise, valumetric scaling, low-pass filtering, and lossy JPEG compression. Each bit (0 or 1) to be embedded is represented by a subset of codewords. For every image block, a vector is extracted and modified so that, with high probability, its correlation with the codewords representing the embedded bit remains larger than its correlation with the codewords representing the other bit, even if the image is later modified by image processing operations. The modification is also carried out so that the change in image fidelity is minimal.
    Image Processing, 2005. ICIP 2005. IEEE International Conference on; 10/2005
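    A schematic sketch of informed coding and informed embedding: each bit owns a subset of codewords, and the extracted block vector is nudged toward its best-matching target codeword until it out-correlates every codeword of the other bit by a margin. The codebook, margin and update rule are illustrative assumptions, not the paper's construction.

        import numpy as np

        rng = np.random.default_rng(11)
        DIM, PER_BIT = 16, 8
        codebook = {b: rng.standard_normal((PER_BIT, DIM)) for b in (0, 1)}
        for b in codebook:                                        # unit-normalize the codewords
            codebook[b] /= np.linalg.norm(codebook[b], axis=1, keepdims=True)

        def embed(vector, bit, margin=0.4, step=0.05):
            """Minimally move `vector` until the target bit wins the correlation test by `margin`."""
            v = vector.astype(np.float64).copy()
            target = codebook[bit][np.argmax(codebook[bit] @ v)]  # informed coding: best codeword for this block
            for _ in range(1000):
                rival = np.max(codebook[1 - bit] @ v)
                if target @ v >= rival + margin:
                    break
                v += step * target                                # small steps keep the fidelity change low
            return v

        def detect(vector):
            return max((0, 1), key=lambda b: np.max(codebook[b] @ vector))

        block_vector = rng.standard_normal(DIM)
        marked = embed(block_vector, bit=1)
        noisy = marked + 0.2 * rng.standard_normal(DIM)           # mild Gaussian-noise attack
        print(detect(noisy), np.linalg.norm(marked - block_vector))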
  • Shan Du, R. Ward
    ABSTRACT: The appearance of a face image is severely affected by illumination conditions, which hinders automatic face recognition. To recognize faces under varying illumination, we propose a wavelet-based illumination normalization method. The method enhances the contrast as well as the edges of face images simultaneously, in the frequency domain using the wavelet transform, to facilitate face recognition tasks. It outperforms the conventional illumination normalization method, histogram equalization, which only enhances image gray-level contrast in the spatial domain. With this method, our face recognition system works effectively under a wide range of illumination conditions. Experimental results on the Yale face database B demonstrate the effectiveness of our method, with a 15.65% improvement, on average, in face recognition performance.
    Image Processing, 2005. ICIP 2005. IEEE International Conference on; 10/2005
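    A minimal sketch of wavelet-domain illumination normalization: equalize the approximation subband and amplify the detail subbands before reconstruction. The wavelet, gain and synthetic image are assumptions.

        import numpy as np
        import pywt

        def equalize(band):
            """Histogram-equalize a subband after scaling it to 8 bits."""
            scaled = np.interp(band, (band.min(), band.max()), (0, 255)).astype(np.uint8)
            hist = np.bincount(scaled.ravel(), minlength=256)
            cdf = hist.cumsum() / hist.sum()
            return cdf[scaled] * 255.0

        def wavelet_normalize(face, edge_gain=1.5, wavelet="db1"):
            """Contrast enhancement in the approximation band plus edge emphasis in the detail bands."""
            cA, (cH, cV, cD) = pywt.dwt2(face.astype(np.float64), wavelet)
            cA = equalize(cA)
            enhanced = pywt.idwt2((cA, (edge_gain * cH, edge_gain * cV, edge_gain * cD)), wavelet)
            return np.clip(enhanced, 0, 255)

        rng = np.random.default_rng(9)
        dark_face = (rng.random((64, 64)) * 80).astype(np.uint8)   # synthetic under-lit image
        print(wavelet_normalize(dark_face).mean() > dark_face.mean())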
  • ABSTRACT: A system for 3-D reconstruction of a rigid object from monocular video sequences is introduced. Initially, an object pose is estimated in each image by locating similar (unknown) texture, assuming a flat depth map for all images. Shape-from-silhouette, as described in R. Szeliski (1993), is then applied to construct a 3-D model, which is used to obtain better pose estimates with a model-based method. Before the process is repeated by building a new 3-D model, the pose estimates are adjusted to reduce error by maximizing a quality measure for shape-from-silhouette volume reconstruction. Translation of the object in the input sequence is compensated in two stages. The volume feedback is terminated when the updates in the pose estimates become small. The final output is a pose index (the last set of pose estimates) and a 3-D model of the object. Good performance of the system is shown by experiments on a real video sequence of a human head. Our method has the following advantages: (1) no model is assumed for the object; (2) feature points are neither detected nor tracked, so no problematic feature matching or lengthy point tracking is required; (3) the method generates a high-level pose index for the input images, which can be used for content-based retrieval. Our method can also be applied to 3-D object tracking in video.
    Circuits and Systems, 2004. ISCAS '04. Proceedings of the 2004 International Symposium on; 06/2004
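    A toy sketch of the shape-from-silhouette step only, with known orthographic camera angles (the paper's system additionally estimates and refines the poses, which is not attempted here):

        import numpy as np

        def carve(silhouettes, angles, grid=32, half_size=1.0):
            """Keep voxels whose projection falls inside every silhouette (orthographic cameras)."""
            lin = np.linspace(-half_size, half_size, grid)
            X, Y, Z = np.meshgrid(lin, lin, lin, indexing="ij")
            occupied = np.ones((grid, grid, grid), dtype=bool)
            for mask, yaw in zip(silhouettes, angles):
                # Rotate the voxel grid about the vertical axis, then drop the depth coordinate.
                xr = np.cos(yaw) * X + np.sin(yaw) * Z
                u = ((xr + half_size) / (2 * half_size) * (mask.shape[1] - 1)).round().astype(int)
                v = ((Y + half_size) / (2 * half_size) * (mask.shape[0] - 1)).round().astype(int)
                occupied &= mask[np.clip(v, 0, mask.shape[0] - 1), np.clip(u, 0, mask.shape[1] - 1)]
            return occupied

        # Two synthetic circular silhouettes of a sphere, seen 90 degrees apart.
        h = w = 64
        yy, xx = np.mgrid[0:h, 0:w]
        disk = ((xx - w / 2) ** 2 + (yy - h / 2) ** 2) < (0.4 * w) ** 2
        hull = carve([disk, disk], angles=[0.0, np.pi / 2])
        print(hull.sum(), "voxels remain in the visual hull")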
  • ABSTRACT: A communication network that offers a deterministic QoS guarantee to VBR traffic sources must use a traffic regulation scheme to reserve network resources for each source. The key component of a traffic regulation scheme is the traffic characterization model used to characterize the traffic of each source. The (σ, ρ) model is so far the most popular traffic model used in communication networks. In order to achieve high network utilization, the parameters of the traffic model should be selected carefully, so that the model specifies the actual traffic as accurately as possible. We present a novel method for selecting the parameters of a (σ, ρ) model for a VBR traffic source, with accuracy, implementation simplicity and execution speed as the design goals. Our approach consists of two parts: 1) constructing the empirical envelope of the source from its traffic, and 2) finding the model parameters from the empirical envelope. We present novel solutions for these two problems. Our method for constructing the empirical envelope is faster and more accurate than existing methods and can be employed in real-time applications. Our method for finding the (σ, ρ) model parameters from the empirical envelope is based on 'divide and conquer' and sequential programming optimization techniques, and finds a near-optimum result. The performance and accuracy of our methods were experimentally compared to other available methods; the results show that the overall performance, specifically the speed and accuracy of our methods, is significantly better than that of the current methods in the literature.
    Information Technology: Research and Education, 2003. Proceedings. ITRE2003. International Conference on; 09/2003
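    A brute-force illustration of the two steps on a toy trace: build the empirical envelope E(t), then take σ(ρ) = max_t [E(t) − ρ·t]. The paper's contribution is performing both steps far more efficiently; this sketch only shows the definitions.

        import numpy as np

        def empirical_envelope(arrivals):
            """E[t] = largest number of bits arriving in any window of t consecutive slots."""
            n = len(arrivals)
            cum = np.concatenate([[0], np.cumsum(arrivals)])
            return np.array([max(cum[s + t] - cum[s] for s in range(n - t + 1)) for t in range(1, n + 1)])

        def sigma_for_rho(envelope, rho):
            """Smallest sigma such that sigma + rho*t upper-bounds the envelope for every t."""
            t = np.arange(1, len(envelope) + 1)
            return float(np.max(envelope - rho * t))

        trace = np.array([0, 12, 3, 0, 25, 1, 0, 0, 18, 2, 0, 7])   # toy per-slot arrivals (kbits)
        env = empirical_envelope(trace)
        rho = trace.sum() / len(trace)                              # long-term average rate
        print("sigma =", sigma_for_rho(env, rho), "kbits at rho =", round(rho, 2), "kbits/slot")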
  • ABSTRACT: A novel method that selectively enhances the visually important regions in scalable H.264 video encoding is proposed. The method is extremely fast and is designed for real-time video communication systems. It is based on fine granular scalability (FGS), which provides a framework for adapting to variations in channel bandwidth and was recently standardized in the streaming video profile of MPEG-4. FGS also gives the encoder the ability to selectively enhance the regions that are visually important, increasing the subjective video quality. In this paper we use the emerging video coding standard H.264 for encoding the base layer, as opposed to MPEG-4. H.264 has several key differences from its predecessor standards, one of them being the new inter-coded macroblock types, and it was observed that specific macroblock types indicate visually important regions. In the proposed method, the macroblock (MB) type information, along with motion vector (MV) data, is used to extract features such as motion activity and camera motion; these features are used to define the regions to be enhanced. Since all operations take place on a block-by-block basis, the method has very low computational complexity and is suitable for real-time video communication systems. Experimental results show that the subjective quality of the video sequence is significantly improved by our method.
    Electrical and Computer Engineering, 2003. IEEE CCECE 2003. Canadian Conference on; 06/2003
  • ABSTRACT: In digital TV systems, variable bit-rate encoding is usually used to improve bandwidth usage efficiency. Since the transmission channel has a fixed bandwidth, this leaves some portions of the bandwidth unoccupied. This free space can be used to transmit extra data that enhance the TV content; such data can be either discrete, like text, or streaming, like video and audio. We address the problem of adding time-sensitive streaming data to a TV program. The crucial part of this problem is a scheduling algorithm that guarantees the on-time delivery of the incidental data to the decoder. We present a time-sensitive scheduling algorithm for off-line multiplexing of TV programs and incidental streaming data. The two important features of our algorithm are: 1) it minimizes the presentation delay for the incidental and main streams; 2) it minimizes the required decoder buffer size for the incidental data. Comparing the experimental results of our algorithm with existing scheduling methods shows that our algorithm significantly reduces the presentation delay and the decoder buffer size for the incidental streams.
    Digital Signal Processing, 2002. DSP 2002. 2002 14th International Conference on; 02/2002
  • M.D. Adams, R. Ward
    ABSTRACT: Two lifting-based families of symmetry-preserving reversible integer-to-integer wavelet transforms are studied. The transforms from both families are shown to be compatible with symmetric extension, which permits the treatment of arbitrary-length signals in a nonexpansive manner. Throughout this work, particularly close attention is paid to rounding functions and the properties they must possess in various instances. Symmetric extension is also shown to be equivalent to constant per-lifting-step extension in certain circumstances.
    Acoustics, Speech, and Signal Processing, 2002. Proceedings. (ICASSP '02). IEEE International Conference on; 02/2002
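    As a concrete member of such a family, the reversible LeGall 5/3 integer-to-integer lifting transform with mirrored (symmetric) index extension, shown only to illustrate reversibility on an arbitrary-length signal. The boundary and rounding choices here are one simple option, not the paper's full analysis.

        import numpy as np

        def _mirror(i, n):
            """Whole-sample symmetric index extension: ..., 2, 1, 0, 1, 2, ..., n-2, n-1, n-2, ..."""
            if n == 1:
                return 0
            period = 2 * (n - 1)
            i = abs(i) % period
            return period - i if i >= n else i

        def forward_53(x):
            """Reversible LeGall 5/3 lifting: predict then update, with floor rounding."""
            x = np.asarray(x, dtype=np.int64)
            n = len(x)
            gx = lambda i: x[_mirror(i, n)]
            d = np.array([gx(2*k + 1) - ((gx(2*k) + gx(2*k + 2)) >> 1) for k in range(n // 2)], dtype=np.int64)
            gd = lambda i: d[_mirror(i, len(d))] if len(d) else np.int64(0)
            s = np.array([gx(2*k) + ((gd(k - 1) + gd(k) + 2) >> 2) for k in range((n + 1) // 2)], dtype=np.int64)
            return s, d

        def inverse_53(s, d):
            """Undo the lifting steps in reverse order; reconstruction is exact."""
            s = np.asarray(s, dtype=np.int64)
            d = np.asarray(d, dtype=np.int64)
            n = len(s) + len(d)
            gd = lambda i: d[_mirror(i, len(d))] if len(d) else np.int64(0)
            x = np.zeros(n, dtype=np.int64)
            x[0::2] = [s[k] - ((gd(k - 1) + gd(k) + 2) >> 2) for k in range(len(s))]
            gx = lambda i: x[_mirror(i, n)]
            x[1::2] = [d[k] + ((gx(2*k) + gx(2*k + 2)) >> 1) for k in range(len(d))]
            return x

        rng = np.random.default_rng(13)
        signal = rng.integers(-128, 128, size=11)      # odd length exercises the boundary handling
        s, d = forward_53(signal)
        print(np.array_equal(inverse_53(s, d), signal))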