ABSTRACT: In this paper, we investigate an approach that computes salient points, i.e. areas of natural images that contain corners or edges, incrementally. We focus on the popular Harris corner detector and demonstrate how such an approach can operate when the image samples are refined in a bitwise manner, i.e. the image bitplanes are received one-by-one from the image sensor. This has the advantage that the image sensing and the salient point detection can be terminated at any input image precision (e.g. at a bound set by the sensory equipment or by computation, or by the salient point accuracy required by the application) and the obtained salient points under this precision are readily available. We estimate the required energy for image sensing as well as the computation required for the salient point detection and compare them against the conventional salient point detector realization that operates directly on each source precision and cannot refine the result. Our experiments demonstrate the feasibility of incremental approaches for salient point detection in various classes of natural images. In addition, a first comparison between the results obtained by the intermediate detectors is presented.
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2008, March 30 - April 4, 2008, Caesars Palace, Las Vegas, Nevada, USA; 01/2008
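The idea of running a Harris detector on progressively refined bitplanes can be sketched as follows. This is only an illustrative sketch, not the realization from the paper: it recomputes the response from scratch at each precision instead of refining it incrementally, and the helper names `keep_msb_planes` and `harris_response`, the 3x3 box window, and the constant `k = 0.04` are all assumptions made for the example.

```python
def keep_msb_planes(img, n_planes, bit_depth=8):
    """Zero all but the n_planes most significant bitplanes of each sample."""
    shift = bit_depth - n_planes
    return [[(p >> shift) << shift for p in row] for row in img]


def harris_response(img, k=0.04):
    """Harris response R = det(M) - k * trace(M)^2 using a 3x3 box window."""
    h, w = len(img), len(img[0])
    # Central-difference gradients; border samples are left at zero.
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    resp = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            sxx = sxy = syy = 0.0
            # Accumulate the structure tensor M over the 3x3 neighbourhood.
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    sxx += gx * gx
                    sxy += gx * gy
                    syy += gy * gy
            resp[y][x] = sxx * syy - sxy * sxy - k * (sxx + syy) ** 2
    return resp


# Hypothetical test image: a bright quadrant whose corner sits at (6, 6).
img = [[200 if (y >= 6 and x >= 6) else 40 for x in range(12)] for y in range(12)]
for n in (1, 2, 8):  # precision grows as more bitplanes arrive
    r = harris_response(keep_msb_planes(img, n))
    peak = max((r[y][x], y, x) for y in range(12) for x in range(12))
    print(n, "bitplane(s): strongest response at", peak[1:])
```

On this synthetic image the strongest response stays at the same location for every precision tried, which is the kind of early termination the abstract describes; natural images would be expected to behave less cleanly.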
ABSTRACT: Multihop networks provide a flexible infrastructure that is based on a mixture of existing access points and stations interconnected via wireless links. These networks present some unique challenges for video streaming applications due to the inherent infrastructure unreliability. In this paper, we address the problem of robust video streaming in multihop networks by relying on delay-constrained and distortion-aware scheduling, path diversity, and retransmission of important video packets over multiple links to maximize the received video quality at the destination node. To provide an analytical study of this streaming problem, we focus on an elementary multihop network topology that enables path diversity, which we term "elementary cell." Our analysis considers several cross-layer parameters at the physical and medium access control (MAC) layers, as well as application-layer parameters such as the expected distortion reduction of each video packet and the packet scheduling via an overlay network infrastructure. In addition, we study the optimal deployment of path diversity in order to cope with link failures. The analysis is validated in each case by simulation results with the elementary cell topology, as well as with a larger multihop network topology. Based on the derived results, we are able to establish the benefits of using path diversity in video streaming over multihop networks, as well as to identify the cases where path diversity does not lead to performance improvements.
IEEE Transactions on Mobile Computing 01/2008; 6(12):1343-1356. · 2.91 Impact Factor
ABSTRACT: State-of-the-art vehicles are now being equipped with multiple video channels for video-data transmission from multiple surveillance cameras mounted on the automobile, navigation videos reporting the traffic conditions on the planned route, as well as entertainment-multimedia streaming for passengers watching on rear-seat monitors. Wireless LANs provide a low-cost and flexible infrastructure for these emerging in-vehicle multimedia services aimed at the driver's and passengers' safety, convenience, and entertainment. To enable the successful simultaneous deployment of such applications over in-vehicle wireless networks, we propose delay-sensitive streaming and packet-scheduling algorithms that enable simple, flexible, and efficient adaptation of the video bitstreams to the instantaneously changing video source and wireless-channel characteristics while complying with the a priori negotiated quality-of-service (QoS) parameters for that video service. Our focus is on real-time low-cost solutions for multimedia transmission over in-vehicle wireless networks that are derived based on existing protocols defined by QoS-enabled networks, such as the IEEE 802.11e standard. In addition, the aim of this paper is to couple the proposed solutions with a novel multitrack-hinting method that is proposed as an extension of conventional MP4 hint tracks in order to provide real-time adaptation of multimedia streams to multiple quality levels for different in-vehicle applications, depending on their importance and delay constraints. First, the scheduling constraints for these simultaneous wireless video-streaming sessions are analytically expressed as a function of the negotiated QoS parameters. This is imperative because a video stream received from an in-vehicle road-surveillance camera will have a different set of delay and quality constraints in comparison to that of traffic monitoring received from remote video cameras located on the planned route. Hence, transmission parameters, such as peak data rate, maximum burst size, minimum transmission delay, maximum error rate, etc., will differ for the various video streams. For this reason, new low-complexity packet-scheduling algorithms that can fulfill diverse QoS streaming conditions are proposed and analyzed. The proposed algorithms produce viable schedules (i.e., strictly QoS-compliant) that jointly consider the delay constraints and the in-vehicle video-receiver-buffer conditions. Hence, these scheduling schemes can completely avoid the underflow or overflow event of the receiving-device buffer while guaranteeing the agreement between the real-time video traffic and the predetermined traffic specification reached during QoS negotiation for various in-vehicle video channels. When combined with multitrack hinting, an integrated flexible system for adaptive multimedia streaming over QoS-enabled in-vehicle wireless networks can be constructed. We demonstrate the viability of the proposed scheduling mechanisms experimentally by using real video traces under multiple quality levels, as derived by the multitrack-hinting design. In addition, simulations under realistic conditions are also performed to validate the ability of the method to satisfy buffer-occupancy constraints.
IEEE Transactions on Vehicular Technology 12/2007; · 2.64 Impact Factor
ABSTRACT: Current systems often assume "worst case" resource utilization for the design and implementation of compression techniques and standards, thereby neglecting the fact that multimedia coding algorithms require time-varying resources, which differ significantly from the "worst case" requirements. To enable adaptive resource management for multimedia systems, resource-estimation mechanisms are needed. Previous research demonstrated that online adaptive linear prediction techniques typically exhibit superior efficiency to other alternatives for resource prediction of multimedia systems. In this paper, we formulate the problem of adaptive linear prediction of video decoding resources by analytically expressing the possible adaptation parameters for a broad class of video decoders. The resources are measured in terms of the time required for a particular operation of each decoding unit (e.g., motion compensation or entropy decoding of a video frame). Unlike prior research that mainly focuses on estimation of execution time based on previous measurements (i.e., based on autoregressive prediction or platform and decoder-specific off-line training), we propose the use of generic complexity metrics (GCMs) as the input for the adaptive predictor. GCMs represent the number of times the basic building blocks are executed by the decoder and depend on the source characteristics, decoding bit rate, and the specific algorithm implementation. Different GCM granularities (e.g., per video frame or macroblock) are explored. Our previous research indicated that GCMs can be measured or modeled at the encoder or the video server side and they can be streamed to the decoder along with the compressed bitstream. A comparison of GCM-based versus autoregressive adaptive prediction over a large range of adaptation parameters is performed. Our results indicate that GCM-based prediction is significantly superior to the autoregressive approach and also requires less computational resources at the decoder. As a result, a novel resource-prediction tradeoff is explored between: 1) the communication overhead for GCMs and/or the implementation overhead for the realization of the predictor and 2) the improvement of the prediction performance. Since this tradeoff can be significant for the decoder platform (either from the communication or the implementation perspective), we propose complexity (or communication)-bounded adaptive linear prediction in order to derive the best resource estimation under the given implementation (or GCM-communication) bound.
IEEE Transactions on Circuits and Systems for Video Technology 07/2007; · 2.26 Impact Factor
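As a rough illustration of adaptive linear prediction driven by complexity metrics, the sketch below uses a normalised LMS (NLMS) update, a standard adaptive filter chosen here for simplicity; the abstract does not state which adaptation rule the paper uses. The function name `nlms_predict`, the step size `mu`, and the synthetic two-feature data are assumptions made for the example.

```python
def nlms_predict(features, targets, mu=0.5, eps=1e-9):
    """Predict each target from its feature vector, adapting weights online.

    features: list of per-unit metric vectors (e.g. hypothetical GCM counts)
    targets:  measured resource usage for the same units (e.g. decoding time)
    Returns the list of predictions made before each update, and the final weights.
    """
    w = [0.0] * len(features[0])
    preds = []
    for x, t in zip(features, targets):
        y = sum(wi * xi for wi, xi in zip(w, x))  # linear prediction
        preds.append(y)
        err = t - y                               # error once the true value is measured
        norm = sum(xi * xi for xi in x) + eps     # input energy for normalisation
        w = [wi + mu * err * xi / norm for wi, xi in zip(w, x)]
    return preds, w


# Synthetic, noise-free example: time = 2 * metric_a + 3 * metric_b.
features = [[float(i % 5 + 1), float((i * i) % 7 + 1)] for i in range(500)]
targets = [2.0 * a + 3.0 * b for a, b in features]
preds, weights = nlms_predict(features, targets)
print("final weights:", weights)
```

With real measurements the targets would be noisy and the weights would track a moving optimum rather than settle, so the step size trades tracking speed against prediction jitter.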
ABSTRACT: In scalable or layered video coding, bitstreams that fulfil specific rate or distortion profiles can be derived (shaped) postencoding, i.e., at transmission time. This can also be a very effective method for real-time adjustment of the decoding complexity to bounds specified by the decoding processor or operating system. In this paper, a new type of bitstream shaping is discussed, which is able to fulfil various instantaneous complexity profiles/constraints imposed by the decoder. Complexity is estimated by modeling generic metrics that can be translated into receiver-specific execution-time or energy-consumption estimates. For various bitstream-shaping alternatives, we associate their corresponding complexity estimates with the decoder modules via a decomposition into complexity coefficients and complexity functions. This process drives bitstream shaping according to joint distortion and complexity constraints, while the proposed model incurs limited online computational and storage overhead. In addition, our experiments demonstrate that the proposed method that is based on offline training can outperform various online training methods commonly used for resource prediction in the literature. For each group of pictures, our bitstream-shaping experiments indicate that the accuracy of the proposed method permits adaptation to real-time imposed constraints on the complexity metrics within a 10% error margin.
IEEE Transactions on Signal Processing 06/2007; · 3.20 Impact Factor
ABSTRACT: Analytical modeling for video coders can be used in a variety of scenarios where information concerning rate, distortion or complexity is essential for driving system or network interactions with media algorithms. While rate and distortion modeling have been covered extensively in previous works, complexity is not well addressed because it is highly algorithm dependent and hence difficult to model. Based on a stochastic modeling framework for the transform coefficients, we present a novel complexity analysis for state-of-the-art wavelet video coding methods by explicitly modeling several aspects found in operational coders, i.e. embedded quantization and quadtree decompositions of block significance maps. The proposed modeling derives for the first time analytical estimates of the expected number of operations (complexity) for a broad class of wavelet video coders based on stochastic source models, coding algorithm and system parameters.
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on; 05/2007
ABSTRACT: This correspondence presents a novel method for the construction of translation-invariant discrete wavelet packet (TIDWP) transforms for any decomposition level k, starting from any phase of a critically sampled discrete wavelet-packet representation of level k. The process is performed by phase shifting, i.e., the direct recovering of the wavelet coefficients omitted by the downsampling operations of each decomposition level without reconstructing the input signal. The proposed method enables tradeoffs between memory utilization and computational efficiency for the construction of translation-invariant representations. Hence, it is useful in resource-constrained TIDWP-based applications of digital signal compression, image segmentation and detection of transients.
IEEE Transactions on Signal Processing 02/2007; · 3.20 Impact Factor
ABSTRACT: A plethora of coding and streaming mechanisms have been proposed for real-time multimedia transmission over the Internet. However, most proposed mechanisms rely only on global (e.g. based on end-to-end measurements), delayed (at least by the round-trip-time), or statistical (often based on simplistic network models) information available about the network state. Based on recently-proposed state-of-the-art open-loop video coding schemes, we propose a new integrated streaming and routing framework for robust and efficient video transmission over networks exhibiting path failures. Our approach explicitly takes into account the network dynamics, path diversity, and the modeled video distortion at the receiver side to optimize the packet redundancy and scheduling. In the derived framework, multimedia streams can be adapted dynamically at the video server based on instantaneous routing-layer information or failure-modeling statistics. The performance of our integrated application and network-layer method is simulated against equivalent approaches that are not optimized based on routing-layer feedback and distortion modeling, and the obtained gains in video quality are quantified.
IEEE Transactions on Multimedia 01/2007; · 1.78 Impact Factor
ABSTRACT: When it comes to the computation of the 2D discrete wavelet transform (DWT), three major computation schedules have been proposed, namely the row-column, the line-based and the block-based. In this work, the lifting-based designs of these schedules are implemented on FPGA-based platforms to execute the forward 2D DWT, and their comparison is presented. Our implementations are optimized in terms of throughput and memory requirements, in accordance with the specifications of each one of the three computation schedules and the lifting decomposition. All implementations are parameterized with respect to the image size and the number of decomposition levels. Experimental results prove that the suitability of each implementation for a particular application depends on the given specifications, concerning the throughput and the hardware cost.
Field Programmable Technology, 2006. FPT 2006. IEEE International Conference on; 01/2007
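For readers unfamiliar with lifting and the row-column schedule, a minimal software sketch follows. It assumes the reversible LeGall 5/3 filter pair with symmetric boundary extension and even-length inputs; the abstract does not name the filters used, and the function names are hypothetical. It says nothing about the FPGA designs themselves.

```python
def lift53_forward(x):
    """One level of integer 5/3 lifting on an even-length list of ints."""
    even, odd = x[0::2], x[1::2]
    half = len(x) // 2
    # Predict step: detail = odd sample minus the floor-average of its even
    # neighbours (the right neighbour is mirrored at the end of the signal).
    d = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1) for i in range(half)]
    # Update step: approximation = even sample plus a rounded quarter of the
    # neighbouring details (the left neighbour is mirrored at the start).
    s = [even[i] + ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(half)]
    return s, d


def lift53_inverse(s, d):
    """Undo lift53_forward by running the lifting steps in reverse order."""
    half = len(s)
    even = [s[i] - ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(half)]
    odd = [d[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1) for i in range(half)]
    x = [0] * (2 * half)
    x[0::2], x[1::2] = even, odd
    return x


def dwt2_row_column(img):
    """Row-column schedule: transform every row, then every column."""
    rows = [s + d for s, d in map(lift53_forward, img)]
    cols = [s + d for s, d in map(lift53_forward, [list(c) for c in zip(*rows)])]
    return [list(r) for r in zip(*cols)]  # transpose back to row order


signal = [3, 7, 1, 8, 2, 9, 4, 4]
approx, detail = lift53_forward(signal)
print(lift53_inverse(approx, detail) == signal)
```

The row-column schedule needs the whole row-transformed image before the column pass starts; the line-based and block-based schedules mentioned in the abstract reorder the same arithmetic to reduce that buffering.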
ABSTRACT: The proliferation of wireless multihop communication infrastructures in office or residential environments depends on their ability to support a variety of emerging applications requiring real-time video transmission between stations located across the network. We propose an integrated cross-layer optimization algorithm aimed at maximizing the decoded video quality of delay-constrained streaming in a multihop wireless mesh network that supports quality-of-service. The key principle of our algorithm lies in the synergistic optimization of different control parameters at each node of the multihop network, across the protocol layers-application, network, medium access control, and physical layers, as well as end-to-end, across the various nodes. To drive this optimization, we assume an overlay network infrastructure, which is able to convey information on the conditions of each link. Various scenarios that perform the integrated optimization using different levels ("horizons") of information about the network status are examined. The differences between several optimization scenarios in terms of decoded video quality and required streaming complexity are quantified. Our results demonstrate the merits and the need for cross-layer optimization in order to provide an efficient solution for real-time video transmission using existing protocols and infrastructures. In addition, they provide important insights for future protocol and system design targeted at enhanced video streaming support across wireless mesh networks.
IEEE Journal on Selected Areas in Communications 12/2006; · 4.14 Impact Factor
ABSTRACT: The quality-of-service (QoS) guarantees enabled by the new IEEE 802.11 a/e Wireless LAN (WLAN) standard are specifically targeting the real-time transmission of multimedia content over the wireless medium. Since video data consume the largest part of the available bitrate compared to other media, optimization of video streaming for this new standard is a significant factor for the successful deployment of practical systems. Delay-constrained streaming of fully-scalable video over IEEE 802.11 a/e WLANs is of great interest for many multimedia applications. The new medium access control (MAC) protocol of IEEE 802.11e is called the Hybrid Coordination Function (HCF) and, in this paper, we will specifically consider the problem of video transmission over HCF Controlled Channel Access (HCCA). A cross-layer optimization across the MAC and application layers of the OSI stack is used in order to exploit the features provided by the combination of the new HCCA standard with new versatile scalable video coding algorithms. Specifically, we propose an optimized and scalable HCCA-based admission control for delay-constrained video streaming applications that leads to a larger number of stations being simultaneously admitted (without quality reduction to any video flow). Subsequently, given the allocated transmission opportunity, each station deploys an optimized Application-MAC-PHY adaptation, scheduling, and protection strategy that is facilitated by the fine-grain layering provided by the scalable bitstream. Given that each video flow needs to always comply with the predetermined (a priori negotiated) traffic specification parameters, this cross-layer strategy enables graceful quality degradation whenever the channel conditions or the video sequence characteristics change. For instance, it is demonstrated that the proposed cross-layer protection and bitstream adaptation strategies facilitate QoS token rate adaptation under link adaptation mechanisms that utilize different physical layer transmission rates. The expected gains offered by the optimized solutions proposed in this paper are established theoretically, as well as through simulations.
IEEE Transactions on Mobile Computing 07/2006; 5(6):755-768. · 2.91 Impact Factor
ABSTRACT: Several input-traversal schedules have been proposed for the computation of the 2D discrete wavelet transform (DWT). In this paper, the row-column, the line-based and the block-based schedules for the 2D DWT computation are compared with respect to their execution time on a very long instruction word (VLIW) digital signal processor (DSP). Implementations of the wavelet transform according to the considered schedules have been developed. They are parameterized with respect to filter pair, image size, and number of decomposition levels. All implementations have been mapped on a VLIW DSP. Performance metrics for the implementations for a complete set of parameters have been obtained and compared. The experimental results show that each implementation performs better for different points of the parameter space.
Circuits and Systems, 2006. ISCAS 2006. Proceedings. 2006 IEEE International Symposium on; 06/2006
ABSTRACT: Delay-constrained streaming of fully-scalable video over IEEE 802.11a/e wireless LANs (WLANs) is of great interest for many emerging multimedia applications. In this paper, we consider the problem of video transmission over HCF controlled channel access (HCCA), which is part of the new medium access control (MAC) protocol of IEEE 802.11e. A cross-layer optimization across the MAC and application layers is used in order to exploit the features provided by the new HCCA standard, as well as by the versatility of new state-of-the-art scalable video coding algorithms. Under pre-determined delay constraints for streaming, the proposed cross-layer strategy leads to a larger number of stations being simultaneously admitted (without any loss in the video quality) than in systems that utilize application-layer only optimizations. At the same time, the fine-grain layering provided by the scalable bitstream facilitates prioritization and unequal retransmissions of packets at the MAC layer thereby enabling graceful quality degradation under channel-capacity limitations and delay constraints. The expected gains offered by the optimized solutions proposed in this paper are established through simulations.
Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on; 06/2006
ABSTRACT: We propose an integrated cross-layer optimization algorithm for maximizing the decoded video quality of delay-constrained streaming in a quality-of-service (QoS) enabled multi-hop wireless mesh network. The key to our algorithm is the synergistic optimization of control parameters at each node of the multi-hop network, across the protocol layers - application, network, medium access control (MAC) and physical (PHY) layers, as well as end-to-end, i.e. across the various network nodes. To drive this optimization, we assume an overlay network infrastructure, which conveys information on the conditions of each link. Quantitative results are presented that demonstrate the merits and the need for cross-layer optimization in an efficient solution for real-time video transmission using existing protocols and infrastructures.
Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on; 06/2006
ABSTRACT: The two-dimensional discrete wavelet transform (2D DWT) is becoming one of the standard tools for image and video compression systems. Various input-traversal schedules have been proposed for its computation. Here, major schedules for 2D DWT computation are compared with respect to their performance on a very long instruction-word (VLIW) digital signal processor (DSP). In particular, three popular transform-production schedules are considered: the row-column, the line based and the block based. Realisations of the wavelet transform according to the considered schedules have been developed. They are parameterised with respect to filter pair, image size and number of decomposition levels. All realisations have been mapped on a VLIW DSP, as these processors currently form an attractive alternative for the realisation of signal, image and video processing systems. Performance metrics for the realisations for a complete set of parameters have been obtained and compared. The experimental results show that each realisation performs better for different points of the parameter space.
IEE Proceedings - Vision Image and Signal Processing 05/2006;
ABSTRACT: Existing research on Universal Multimedia Access has mainly focused on adapting multimedia to the network characteristics while overlooking the receiver capabilities. Alternatively, part 7 of the MPEG-21 standard entitled Digital Item Adaptation (DIA) defines description tools to guide the multimedia adaptation process based on both the network conditions and the available receiver resources. In this paper, we propose a new and generic rate-distortion-complexity model that can generate such DIA descriptions for image and video decoding algorithms running on various hardware architectures. The novelty of our approach is in virtualizing complexity, i.e., we explicitly model the complexity involved in decoding a bitstream by a generic receiver. This generic complexity is translated dynamically into "real" complexity, which is architecture-specific. The receivers can then negotiate with the media server/proxy the transmission of a bitstream having a desired complexity level based on their resource constraints. Hence, unlike in previous streaming systems, multimedia transmission can be optimized in an integrated rate-distortion-complexity setting by minimizing the incurred distortion under joint rate-complexity constraints.
IEEE Transactions on Multimedia 07/2005; · 1.78 Impact Factor
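The final step described above, minimizing distortion under joint rate and complexity constraints, reduces to a constrained selection once a discrete set of candidate bitstreams is available. The sketch below shows only that selection; the tuple layout, the function name `pick_operating_point`, and the numeric values are hypothetical and are not taken from the paper.

```python
def pick_operating_point(points, max_rate, max_complexity):
    """Return the feasible (rate, complexity, distortion) tuple with the
    lowest distortion, or None when no candidate meets both bounds."""
    feasible = [p for p in points if p[0] <= max_rate and p[1] <= max_complexity]
    return min(feasible, key=lambda p: p[2], default=None)


# Hypothetical candidates: (rate in kbit/s, generic complexity units, distortion as MSE).
candidates = [(100, 10, 40.0), (200, 25, 30.0), (400, 60, 22.0)]
print(pick_operating_point(candidates, max_rate=300, max_complexity=30))
```

A receiver with a tighter complexity budget would simply pass a smaller `max_complexity`, which is the negotiation the abstract describes in spirit.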
ABSTRACT: A number of emerging resolution-scalable image and video coding algorithms have recently shown very promising performance due to the use of overcomplete wavelet representations. In these applications, the overcomplete discrete wavelet transform (ODWT) is derived starting from the critically-sampled subbands of the DWT (complete representation) of a certain decomposition (resolution) level. This process, which is a complete-to-overcomplete DWT (CODWT), is used for wavelet domain operations that require shift invariance. Specifically, both the encoder and decoder independently construct the overcomplete representation at the best accuracy possible, given the critically-sampled subbands of a certain resolution level. In contrast to the classical approach for performing the CODWT, which is a multi-rate calculation scheme that requires the reconstruction of the input spatial-domain signal, in this paper we propose a new, single-rate calculation scheme, which is formalized for the general case of an arbitrary decomposition (resolution) level. Based on derived symmetry properties, a simple implementation structure of the proposed approach provides interesting tradeoffs for the required multiplication budget in comparison to the conventional approach. This leads to a complexity-scalable solution that fits the versatile requirements of scalable coding environments. The use of the proposed single-rate calculation scheme of the CODWT is demonstrated in several image and video coding systems.
ABSTRACT: A new transform is proposed that derives the overcomplete discrete wavelet transform (ODWT) subbands from the critically sampled DWT subbands (complete representation). This complete-to-overcomplete DWT (CODWT) has certain advantages in comparison to the conventional approach that performs the inverse DWT to reconstruct the input signal, followed by the à-trous or the lowband shift algorithm. Specifically, the computation of the input signal is not required. As a result, the minimum number of downsampling operations is performed and the use of upsampling is avoided. The proposed CODWT computes the ODWT subbands by using a set of prediction-filter matrices and filtering-and-downsampling operators applied to the DWT. This formulation demonstrates a clear separation between the single-rate and multirate components of the transform. This can be especially significant when the CODWT is used in resource-constrained environments, such as resolution-scalable image and video codecs. To illustrate the applicability of the proposed transform in these emerging applications, a new scheme for the transform-calculation is proposed, and existing coding techniques that benefit from its usage are surveyed. The analysis of the proposed CODWT in terms of arithmetic complexity and delay reveals significant gains as compared with the conventional approach.
IEEE Transactions on Signal Processing 04/2005; 53:1398-1412. · 3.20 Impact Factor