-
ABSTRACT: Polarization, a property of light that conveys information about the transverse electric field orientation, complements other attributes of electromagnetic radiation such as intensity and frequency. Using multiple passive polarimetric images, we develop an iterative, model-based approach to estimate the complex index of refraction and apply it to target classification.
IEEE Transactions on Image Processing 02/2011; · 3.04 Impact Factor
-
ABSTRACT: We consider here image segmentation as a problem of clustering texture features by frequency content. Specifically, we develop a low complexity algorithm for image segmentation that operates directly on the bitstream of JPEG compressed images. Using morphological filtering and watersheds, the algorithm effectively segments an image by combining areas of similar frequency content. Its low complexity and the fact that it does not require decoding make it well suited for distributed wireless sensor networks and image database search applications.
Digital Signal Processing Workshop and 5th IEEE Signal Processing Education Workshop, 2009. DSP/SPE 2009. IEEE 13th; 02/2009
-
ABSTRACT: A perceptually scalable audio coder generates a bit-stream that contains layers of audio fidelity and is encoded in such a way that adding one of these layers enhances the reconstructed audio by an amount that is just noticeable by the listener. Such algorithms have applications like music on demand at variable levels of fidelity, for instance using 3G and 4G cellular radio systems operating at different bit rates. While the MPEG-4 natural audio coder can create finely scalable bit streams using bit sliced arithmetic coding (BSAC), its perceptual quality at low bit rates is poor. On the other hand, the nonscalable transform-domain weighted interleaved vector quantization (TWIN-VQ) performs well at low bit rates. In this paper, we present a modified version of the TWIN-VQ algorithm that generates a perceptually scalable bit-stream with many fine layers of audio fidelity. Using TWIN-VQ as our base ensures the best possible perceptual quality at low bit rates. Specifically, the proposed scalable algorithm performs as well as TWIN-VQ at rates of 8 to 16 kb/s and outperforms scalable BSAC by between 64% and 172% at rates of less than 24 kb/s.
IEEE Transactions on Audio Speech and Language Processing 08/2008; · 1.50 Impact Factor
-
ABSTRACT: In this paper, we develop a low complexity algorithm for spatial overlap detection and characterization that operates directly on the bitstream of motion-JPEG compressed video. Its low complexity and the fact that it does not require video decoding at the sensor nodes make it well suited to multi-view distributed video coding applications for wireless sensor networks.
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on; 05/2008 · 4.63 Impact Factor
-
ABSTRACT: Passive imaging polarimetry has emerged as a useful tool in many remote sensing applications including material classification, target detection and shape extraction. In this paper we present a method to classify specular objects based on their material composition from passive polarimetric imagery. The proposed algorithm is built on an iterative, model-based method to recover the complex index of refraction of a specular target from multiple polarization measurements. The recovered parameters are then used to discriminate between objects by employing the nearest neighbor rule. Experimental results indicate that the classification approach is highly effective for distinguishing between various targets of interest. Most significantly, the proposed classification method is robust to a wide range of observational geometry.
Image Analysis and Interpretation, 2008. SSIAI 2008. IEEE Southwest Symposium on; 04/2008
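The nearest neighbor rule mentioned in this abstract can be illustrated with a minimal sketch. It assumes each material is summarized by a single reference pair (n, k), the real and imaginary parts of its complex index of refraction, and that Euclidean distance is the comparison metric; the abstract specifies neither. The labels and numeric values in `REFERENCE` are hypothetical placeholders, not measured data.

```python
import math

# Hypothetical reference library: label -> (n, k), the real and imaginary
# parts of the complex index of refraction. Placeholder values only.
REFERENCE = {
    "material_a": (1.5, 0.0),
    "material_b": (2.4, 3.1),
    "material_c": (1.2, 7.0),
}


def classify(n_est, k_est, reference=REFERENCE):
    """Return the label whose (n, k) pair is closest in Euclidean distance."""
    return min(
        reference,
        key=lambda label: math.dist((n_est, k_est), reference[label]),
    )


if __name__ == "__main__":
    # An estimate recovered from polarization measurements (made-up numbers).
    print(classify(2.0, 3.0))
```

In practice the two features may need rescaling before distances are compared, since n and k can span different ranges.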
-
ABSTRACT: In this paper, we develop a low complexity algorithm for spatial overlap detection and characterization that operates directly on the bitstream of motion-JPEG compressed video. Its low complexity and the fact that it does not require video decoding at the sensor nodes make it well suited to multi-view distributed video coding applications for wireless sensor networks. In addition to determining overlap, we also show using the MERL ballroom database that the proposed algorithm can accurately identify and localize non-common foreground objects (e.g., dancers) within the common overlap region of two camera views.
Image Analysis and Interpretation, 2008. SSIAI 2008. IEEE Southwest Symposium on; 04/2008
-
ABSTRACT: The goal of this paper is to develop an audio quality metric that can accurately quantify subjective quality over audio fidelities ranging from highly impaired to perceptually lossless. As one example of its utility, such a metric would allow scalable audio coding algorithms to be easily optimized over their entire operating ranges. We have found that the ITU-recommended objective quality metric, ITU-R BS.1387, does not accurately predict subjective audio quality over the wide range of fidelity levels of interest to us. In developing the desired universal metric, we use as a starting point the model output variables (MOVs) that make up BS.1387 as well as the energy equalization truncation threshold which has been found to be particularly useful for highly impaired audio. To combine these MOVs into a single quality measure that is both accurate and robust, we have developed a hybrid least-squares/minimax optimization procedure. Our test results show that the minimax-optimized metric is up to 36% lower in maximum absolute error compared to a similar metric designed using the conventional least-squares procedure.
IEEE Transactions on Audio Speech and Language Processing 02/2008; · 1.50 Impact Factor
-
ABSTRACT: Perceptual audio compression uses the idea of auditory masking to hide coding distortion. These auditory masking thresholds are obtained from mathematical models of the human ear. At low bitrates, however, coding noise is significant and cannot be masked by the audio content. A perceptually scalable audio compression system, even at low bitrates, should generate a bitstream with layers of audio fidelity such that each layer improves the quality of the reconstructed audio by an amount that is just noticeable by the listener. In this paper we describe a low bitrate (8-64 kbps), scalable audio compression system which uses a residual weighted VQ algorithm to generate a scalable bitstream. To modify this bitstream so that it is perceptually scalable, we layer the different residual indices generated by the coder using objective audio quality metrics developed for evaluating highly impaired audio signals.
Signals, Systems and Computers, 2007. ACSSC 2007. Conference Record of the Forty-First Asilomar Conference on; 12/2007
-
ABSTRACT: A passive polarization based imaging system records the polarization state of light reflected by objects that are illuminated with an unpolarized and generally uncontrolled source. Such systems can be useful in many remote sensing applications including target detection, object segmentation and material classification. In this paper we present a method to jointly estimate the complex index of refraction and the view angle of a target from multiple measurements collected by a passive polarimeter. This generalizes our previous work which was applicable only to dielectric targets. An expression for the degree of polarization is derived from the microfacet polarimetric bidirectional reflectance model for the case of scattering in the plane of incidence. Using this expression, we develop nonlinear least squares estimation algorithms for extracting the complex index of refraction and view angle from multiple polarization measurements. The effectiveness of the proposed method is validated with data collected in laboratory conditions. Experimental results indicate that the proposed method is effective for recovering the parameters of interest for real world data and that the complex index of refraction thus computed provides a feature vector that is robust to the view angle.
Signals, Systems and Computers, 2006. ACSSC '06. Fortieth Asilomar Conference on; 12/2006
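To make the link between the complex index of refraction, the view angle, and the degree of polarization concrete, here is a simplified sketch. It is not the microfacet polarimetric bidirectional reflectance expression derived in the paper: it assumes ideal specular reflection of unpolarized light from a smooth, nonmagnetic surface in air, for which the standard Fresnel equations give DoP = (Rs - Rp) / (Rs + Rp). The function names are illustrative.

```python
import cmath
import math


def fresnel_reflectances(n_complex, theta_i):
    """Power reflectances (Rs, Rp) for light incident from air at theta_i radians."""
    cos_i = math.cos(theta_i)
    sin_i = math.sin(theta_i)
    n_sq = n_complex * n_complex
    # q equals n * cos(theta_t), written so it also works for absorbing media.
    q = cmath.sqrt(n_sq - sin_i * sin_i)
    r_s = (cos_i - q) / (cos_i + q)
    r_p = (n_sq * cos_i - q) / (n_sq * cos_i + q)
    return abs(r_s) ** 2, abs(r_p) ** 2


def degree_of_polarization(n_complex, theta_i):
    """DoP of specularly reflected light when the illumination is unpolarized."""
    r_s, r_p = fresnel_reflectances(n_complex, theta_i)
    return (r_s - r_p) / (r_s + r_p)


if __name__ == "__main__":
    # Sweep the incidence angle for a dielectric and for an absorbing surface.
    for deg in (10, 30, 50, 70):
        theta = math.radians(deg)
        print(deg,
              degree_of_polarization(1.5, theta),
              degree_of_polarization(2.0 + 3.0j, theta))
```

Under this simplified model a lossless dielectric reaches a DoP of 1 at the Brewster angle, while an absorbing surface stays below 1 at every angle. That difference is one reason measurements at several angles carry information about both the real and imaginary parts of the index.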
-
ABSTRACT: A perceptually scalable audio coder generates a bit-stream that contains layers of audio fidelity and is encoded in such a way that adding one of these layers enhances the reconstructed audio by an amount that is just noticeable by the listener. Such algorithms have applications like music on demand at variable levels of fidelity for 3G and 4G cellular radio systems operating at different bit rates. While the MPEG-4 natural audio coder can create scalable bit streams, its perceptual quality at low bit rates is poor. On the other hand, the nonscalable TWIN-VQ performs well at low bit rates. In this paper we present a technique to modify the TWIN-VQ algorithm such that it generates a perceptually scalable bit-stream with layers of audio fidelity. Using TWIN-VQ as our base ensures the best possible perceptual quality at low bit rates (8-16 kbps).
Data Compression Conference, 2006. DCC 2006. Proceedings; 04/2006
-
C.D. Creusere
ABSTRACT: In this paper, we study coding artifacts in MPEG-compressed scalable audio. Specifically, we consider the MPEG advanced audio coder (AAC) using bit slice scalable arithmetic coding (BSAC) as implemented in the MPEG-4 reference software. First we perform human subjective testing using the comparison category rating (CCR) approach, quantitatively comparing the performance of scalable BSAC with the nonscalable TwinVQ and AAC algorithms. This testing indicates that scalable BSAC performs very poorly relative to TwinVQ at the lowest bitrate considered (16 kb/s) largely because of an annoying and seemingly random mid-range tonal signal that is superimposed onto the desired output. In order to better understand and quantify the distortion introduced into compressed audio at low bit rates, we apply two analysis techniques: Reng bifrequency probing and time-frequency decomposition. Using Reng probing, we conclude that aliasing is most likely not the cause of the annoying tonal signal; instead, time-frequency or spectrogram analysis indicates that its cause is most likely suboptimal bit allocation. Finally, we describe the energy equalization quality metric (EEQM) for predicting the relative perceptual performance of the different coding algorithms and compare its predictive ability with that of ITU Recommendation ITU-R BS.1387-1.
IEEE Transactions on Speech and Audio Processing 06/2005; · 2.29 Impact Factor
-
ABSTRACT: ITU-R BS.1387-1 gives a method for objective measurement of perceived audio quality known as PEAQ (perceptual evaluation of audio quality). This algorithm has been developed for measuring the quality of mid and high quality audio. We show that the advanced version of PEAQ performs poorly when compared to the previously developed energy equalization approach (EEA) for evaluating the quality of low bitrate scalable audio. We also show that including an energy equalization parameter as one of the model output variables (MOVs) of the advanced version improves its performance significantly; the performance of this modified version is superior to that of EEA.
Acoustics, Speech, and Signal Processing, 2005. Proceedings. (ICASSP '05). IEEE International Conference on; 04/2005 · 4.63 Impact Factor
-
ABSTRACT: To reduce storage and transmission requirements, digital images are generally compressed in some fashion. Consequently, if one is interested in detecting and classifying spatial objects of interest within an image, it might, in many instances, be more efficient to do so in the compressed domain because less data would need to be processed and the computation required to decode the image would be avoided. In our earlier work, we have shown that object detection in the JPEG bitstream domain is both effective and efficient. In this paper, we expand on this earlier work, addressing issues such as detection within a constant false alarm rate (CFAR) context, detection over multiple frames, and multiple objects classification - all within the bitstream of a JPEG-compressed image.
Digital Signal Processing Workshop, 2004 and the 3rd IEEE Signal Processing Education Workshop. 2004 IEEE 11th; 09/2004
-
ABSTRACT: Most image and video data is stored or transmitted after compression for efficiency. Processes like pattern detection and localization typically include the extra expense of decompressing the data since most image and video processing requires access to the original pixel values in the spatial domain. It may, however, be more computationally efficient if we directly process the compressed bit stream or operate in the intermediate transform domain without fully decompressing the data. Coders like SPIHT and JPEG 2000 are built around the wavelet transform, so in this paper we study detection and localization in the wavelet domain.
Signals, Systems and Computers, 2003. Conference Record of the Thirty-Seventh Asilomar Conference on; 12/2003
-
C.D. Creusere
ABSTRACT: A scalably compressed bitstream is one which can be streamed and decoded at a wide variety of bitrates, and it is therefore compatible with communications channels of varying capacity. The audio coding portions of the MPEG 2 and 4 standards support fine-grained scalability through the use of bit slice arithmetic coding (BSAC). Human subjective analysis of BSAC, however, has shown that it performs poorly at low bitrates; seemingly random tonal patterns are superimposed on the actual audio. Here, we develop a new approach for objectively characterizing such distortion and validate it with human subjective trials. Unlike most other objective performance metrics, the proposed approach does not require sample-accurate sequence synchronization. As a comparison, we also apply the ITU-R BS.1387-1 objective testing recommendation to the same audio sequences and quantify how well it predicts the observed subjective quality.
Signals, Systems and Computers, 2003. Conference Record of the Thirty-Seventh Asilomar Conference on; 12/2003
-
C.D. Creusere
ABSTRACT: We study coding artifacts in MPEG-compressed scalable audio. Specifically, we consider the MPEG advanced audio coder (AAC) using bit slice scalable arithmetic coding (BSAC) as implemented in the MPEG 4 reference software. First, we perform human subjective testing using the comparison category rating (CCR) approach, quantitatively comparing the performance of scalable BSAC with the nonscalable TwinVQ and AAC algorithms. This testing indicates that scalable BSAC performs very poorly relative to TwinVQ at the lowest bitrate considered (16 kb/s), largely because of an annoying and seemingly random mid-range tonal signal that is superimposed onto the desired output. In order to understand better and quantify perceptually the various forms of distortion introduced into compressed audio at low bit rates, we apply two analysis techniques: Reng probing and time-frequency decomposition. The Reng probing technique is capable of separating the linear time-invariant component of a multirate system from its nonlinear and periodically time-varying components. Using this technique, we conclude that aliasing is probably not the cause of the annoying tonal signal; instead, time-frequency analysis indicates that its cause is most likely suboptimal bit allocation.
Data Compression Conference, 2002. Proceedings. DCC 2002; 02/2002
-
ABSTRACT: We study the problem of detecting and localizing objects that are embedded in compressed video sequences. Such a capability has two major and increasingly important practical uses: (1) video surveillance; (2) identification of copyright infringement. We focus here only on the problem of video surveillance. As a general rule, detection and localization of patterns is most efficiently performed in a reduced-dimensional subspace of the original object space. In this regard, it would be ideal to operate directly on the compressed bit stream. As a first step towards doing this, we consider here the problem of detecting and localizing video objects in the DCT domain (i.e., after the quantized DCT coefficients have been decoded but before the inverse DCT has been applied). We present comparisons between this DCT-based approach and the more conventional method in which object detection and localization is performed entirely in the spatial domain.
Signals, Systems and Computers, 2001. Conference Record of the Thirty-Fifth Asilomar Conference on; 02/2001
-
C.D. Creusere
ABSTRACT: A rate-distortion-optimal embedded image compression algorithm is one in which each bit is generated by the encoder in such a way that the reconstruction error of the decoded image is reduced as much as possible when that bit is received. Popular embedded algorithms like set partitioning in hierarchical trees (SPIHT) use heuristic techniques to approximately achieve such optimality. Two other approaches have been developed recently that are rigorously optimized for rate-distortion within their respective algorithmic frameworks. In this work, we address the question of optimality within the framework of SPIHT, focusing specifically on the ordering of refinement and significance map information within the bit stream. From our experimental results, we conclude that SPIHT is almost optimal with respect to its ordering of these passes.
Signals, Systems and Computers, 2000. Conference Record of the Thirty-Fourth Asilomar Conference on; 02/2000
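The embedded property described in this abstract, where a truncated bit stream still decodes and each extra piece lowers the reconstruction error, can be sketched with plain bit-plane coding of integer coefficients. This is not SPIHT: there is no wavelet transform, no set partitioning, and no significance/refinement pass ordering, and the signs are sent up front as a simplification. All function names are illustrative.

```python
def bitplane_encode(coeffs, num_planes):
    """Split integer coefficients into signs plus bit planes, most significant first."""
    signs = [c < 0 for c in coeffs]
    planes = [
        [(abs(c) >> p) & 1 for c in coeffs]
        for p in range(num_planes - 1, -1, -1)
    ]
    return signs, planes


def bitplane_decode(signs, planes, num_planes):
    """Rebuild coefficients from however many leading planes were received."""
    mags = [0] * len(signs)
    for i, plane in enumerate(planes):
        p = num_planes - 1 - i  # bit position carried by the i-th received plane
        mags = [m | (b << p) for m, b in zip(mags, plane)]
    return [-m if s else m for m, s in zip(mags, signs)]


def squared_error(a, b):
    """Sum of squared differences between two coefficient lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


if __name__ == "__main__":
    coeffs = [37, -12, 5, 0, -63, 20]  # made-up coefficients, all below 2**6
    signs, planes = bitplane_encode(coeffs, 6)
    for k in range(7):
        approx = bitplane_decode(signs, planes[:k], 6)
        print(k, squared_error(coeffs, approx))
```

Each additional plane can only leave the error unchanged or lower it. The question studied in the paper is finer than this: within such a stream, which ordering of the bits reduces the error fastest.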
-
ABSTRACT: The perceptual evaluation of audio quality (PEAQ), an ITU metric, has been developed for objective measurement of high quality audio. Previously it has been shown that the energy equalization approach (EEA), and a metric that uses EEA as a model output variable (MOV) together with the five standard MOVs of the PEAQ advanced version, outperform the PEAQ metric alone for measuring low bitrate audio quality. In this paper, we show that the latter approach also performs better than both the PEAQ advanced version and EEA alone for measuring mid quality audio. Further, the use of bitrate information in our metric improves its accuracy across different audio qualities, thereby making it scalable.
Signals, Systems and Computers, 2005. Conference Record of the Thirty-Ninth Asilomar Conference on;
-
ABSTRACT: Passive imaging polarimetry has emerged as a useful tool in many remote sensing applications including material classification, target detection and shape extraction. In this paper we present a method to classify specular objects based on their material composition from passive polarimetric imagery. The proposed algorithm is built on an iterative model-based method to recover the complex index of refraction of a specular target from multiple polarization measurements. The recovered parameters are then used to discriminate between objects by employing the nearest neighbor rule. The effectiveness of the proposed method is validated with data collected in laboratory conditions. Experimental results indicate that the classification approach is highly effective for distinguishing between various targets of interest. Most significantly, the proposed classification method is robust to a wide range of observational geometry.
Image Processing, 2007. ICIP 2007. IEEE International Conference on;