Conference Paper

Using offline bitstream analysis for power-aware video decoding in portable devices

DOI: 10.1145/1101149.1101209 Conference: Proceedings of the 13th ACM International Conference on Multimedia, Singapore, November 6-11, 2005
Source: DBLP


Dynamic voltage/frequency scaling algorithms for multimedia applications have recently been a subject of intensive research. Many of these algorithms use control-theoretic feedback techniques to predict an application's future execution demand from its demand in the recent past. Such techniques suffer from two major disadvantages: (i) they are computationally expensive, and (ii) it is difficult to give performance or quality-of-service guarantees based on them, since the predictions can occasionally turn out to be incorrect. To address these shortcomings, in this paper we propose a completely new approach to dynamic voltage and frequency scaling, based on an offline bitstream analysis of multimedia files. Using this analysis, we insert metadata describing the computational demand that decoding the file will generate; the metadata typically consists of the frequency values at which the processor needs to run at different points in time during the decoding process. The bitstream analysis and metadata insertion can be done when the multimedia file is being downloaded into a portable device from a desktop computer. We illustrate this technique using the MPEG-2 decoder application and show that the metadata to be inserted is a very small fraction of the total size of the video clip, yet can lead to significant energy savings. Lastly, in contrast to runtime prediction-based techniques, our scheme can be used to provide performance and quality-of-service guarantees while at the same time avoiding any runtime computation overhead.
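As a rough illustration of the playback side of this idea, the sketch below assumes the offline analysis has already produced a list of (start frame, frequency) hints embedded as metadata. The `FreqHint` record and the `set_frequency()` hook are hypothetical illustrations, not the paper's actual metadata encoding:

```python
# Hypothetical sketch, not the paper's actual format: per-interval frequency
# hints precomputed by offline bitstream analysis drive DVFS at playback
# time, with no runtime prediction.
from dataclasses import dataclass

@dataclass
class FreqHint:
    start_frame: int  # first frame this hint covers
    freq_mhz: int     # processor frequency chosen offline for the interval

def decode_with_hints(hints, num_frames, decode_frame, set_frequency):
    """Decode frames, switching frequency only at hint boundaries."""
    hints = sorted(hints, key=lambda h: h.start_frame)
    idx, target, applied = 0, None, None
    for frame in range(num_frames):
        # Advance to the hint that covers this frame.
        while idx < len(hints) and hints[idx].start_frame <= frame:
            target = hints[idx].freq_mhz
            idx += 1
        if target is not None and target != applied:
            set_frequency(target)  # assumed platform hook (e.g. a cpufreq write)
            applied = target
        decode_frame(frame)
```

Because the frequency schedule is read rather than predicted, the runtime cost is a table walk, which is the contrast the abstract draws with feedback-based predictors.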



Available from: Ye Wang, Mar 22, 2014
    • "A straightforward way to compute these workload values uses time-consuming SimpleScalar simulations. In this work, motivated by a workload model for MPEG-2 decoder tasks presented in [14], we propose a fast model-based performance analysis method which integrates our workload model of the decoder tasks with a performance model (using VCCs) of the MPSoC architecture, thereby providing a fast and efficient clustering of the video clips. Here, SimpleScalar simulations to obtain workload values for each task are completely avoided and bitstream analysis (incorporating our MPEG-2 workload model) is used instead. "
    ABSTRACT: Currently, performance analysis of multimedia-MPSoC platforms largely relies on simulation. The execution of one or more applications on such a platform is simulated for a library of test video clips. If all specified performance constraints are satisfied for this library, then the architecture is assumed to be well-designed. This is similar to testing software for functional correctness. However, in contrast to functional testing, simulating a set of video clips for a complex application/architecture is extremely time-consuming. In this paper we propose a technique for clustering a library of video clips, such that it is sufficient to simulate only one clip from each cluster rather than the entire library. Our clustering is scalable, i.e., the number of clusters may be determined based on the number of clips that the system designer wishes to simulate (which is independent of the input library size). For each video clip in the library, we perform a fast bitstream analysis from which the workload generated while processing this clip on the given architecture may be estimated. This workload information, in conjunction with a workload model and a performance model of the architecture, is used for the clustering. This entire process does not involve any simulation and is hence extremely fast. We illustrate its utility through a detailed case study using an MPEG-2 decoder application running on an MPSoC platform. As part of the validation of our methodology, it was observed that video clips falling into the same cluster exhibit similar worst-case buffer backlogs and worst-case delays for one macroblock. Overall the results demonstrate that the proposed method provides a very fast and accurate analysis and hence can be of significant benefit to the system designer.
    Full-text · Conference Paper · Jan 2009
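The clip-clustering idea in the abstract above could be sketched roughly as follows. The workload vectors, the deterministic farthest-point seeding, and the choice of k are all illustrative assumptions, not the authors' actual method:

```python
# Illustrative sketch (not the paper's algorithm): cluster clips by their
# estimated workload vectors, so only one representative clip per cluster
# needs full simulation.

def dist2(a, b):
    """Squared Euclidean distance between two workload vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_seed(vectors, k):
    """Deterministic seeding: start from clip 0, then repeatedly add the
    clip farthest from all centroids chosen so far."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(dist2(v, c) for c in centroids))
        centroids.append(list(far))
    return centroids

def cluster_clips(vectors, k, iters=20):
    """Plain k-means over per-clip workload vectors; returns a cluster id
    per clip. Simulating the first clip of each cluster then stands in for
    simulating the whole library."""
    centroids = farthest_point_seed(vectors, k)
    assign = [0] * len(vectors)
    for _ in range(iters):
        # Assign each clip to its nearest centroid.
        for i, v in enumerate(vectors):
            assign[i] = min(range(k), key=lambda c: dist2(v, centroids[c]))
        # Recompute centroids as the mean of their members.
        for c in range(k):
            members = [vectors[i] for i, a in enumerate(assign) if a == c]
            if members:
                centroids[c] = [sum(xs) / len(members) for xs in zip(*members)]
    return assign
```

The workload vectors themselves would come from the fast bitstream analysis described in the abstract; any vector of per-task workload estimates fits this interface.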
    • "What is more, video data is compressed by variable-bit-rate (VBR) compression techniques and so exhibits a high degree of data variability. For example, the ratio of the maximum and the average bit-rate of some MPEG videos can be as high as a factor of 10 [5]. These characteristics make timely transitions of the power modes difficult, which may cause the playback to be distorted or stopped. "
    ABSTRACT: To support the large storage requirements, consumer electronics for video playback are increasingly being equipped with hard disk drives (HDD) that consume a significant amount of energy. A video player may prefetch many frames to give the disk an opportunity to go to standby mode, but this may cause playback to be distorted or stopped if timely power mode transitions are not incorporated. We present the design, implementation and evaluation of a data prefetching scheme for energy-aware video data retrieval for portable media players (PMP). We formulate the problem when prefetching is used for variable-bit-rate (VBR) streams to reduce disk energy consumption and then propose a new energy-aware data retrieval scheme that prefetches video data in a just-in-time way so as to increase the period in which the disk stays in standby mode while guaranteeing real-time service. We implemented our scheme in the legacy video player Mplayer, which is typically used for Linux-based consumer devices. Experimental results show that it saves as much as 51% energy compared with conventional schemes.
    Preview · Article · Dec 2008 · IEEE Transactions on Consumer Electronics
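A minimal sketch of the just-in-time idea from the abstract above, with all parameters (buffer capacity, disk fill rate, spin-up delay, per-slot demand) invented for illustration rather than taken from the paper:

```python
# Sketch under assumed parameters: keep the disk in standby as long as the
# VBR demand schedule allows, waking it just in time to refill the buffer
# before playback would underflow.

def standby_slots(demand, buf_cap, fill_rate, spinup):
    """Simulate slot-by-slot playback; return indices of standby slots.

    demand[t]  : bits consumed in slot t (VBR stream)
    buf_cap    : prefetch buffer capacity in bits
    fill_rate  : bits the disk delivers per active slot
    spinup     : slots needed to leave standby before data flows again
    """
    buf, state, waking, idle = 0, "active", 0, []
    for t in range(len(demand)):
        if state == "standby":
            # Wake early enough to cover consumption during spin-up.
            need = sum(demand[t:t + spinup + 1])
            if buf < need:
                state, waking = "waking", spinup
            else:
                idle.append(t)
        if state == "waking":
            waking -= 1
            if waking <= 0:
                state = "active"
        elif state == "active":
            buf = min(buf_cap, buf + fill_rate)
            if buf == buf_cap:
                state = "standby"  # buffer full: spin down
        buf -= demand[t]
        assert buf >= 0, "underflow: playback would stall"
    return idle
```

Longer standby runs (more entries in `idle`) translate directly into disk energy savings, which is the quantity the paper optimizes.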
    • "For example, counting the macroblocks and their types leads to an accurate characterization of the decode complexity of the H.264 AVC video decoder [7] [9] [16]. At the other end of the spectrum, hardware-dependent characterization determines a frame's decode complexity [4] [5] [6] [10] [11] [15]. This is typically done by comparing the decode time at a nominal clock frequency and voltage level, versus the per-frame deadline. "
    ABSTRACT: Scenario-based design exploits the time-varying execution behavior of applications by dynamically adapting the system on which they run. This is a particularly interesting design methodology for media applications with soft real-time constraints such as decoders: frames can be classified into scenarios based on their decode complexity, and the system can be configured on a per-scenario basis such that energy consumption is reduced while still meeting the deadlines. At the foundation of scenario-based design lies the ability to identify scenarios, or recurring modes of operation with similar run time characteristics. There are two opposite ends to scenario identification. Some researchers have proposed techniques that, based on domain knowledge, identify hardware-independent scenarios in a media input stream. At the other end, other researchers have proposed techniques that identify hardware-dependent scenarios in a (semi-)automated way. This paper proposes a scenario identification approach that bridges both opposite ends, and finds hardware-independent scenarios in an automated way. It does so by computing execution profiles on a per-frame basis that capture the application's code execution patterns. We find that Edge Vectors (EVs) are more accurate than Basic Block Vectors (BBVs) at capturing the variation in frame-level decode complexity. The complexity of the proposed automated scenario identification is comparable to existing hardware-dependent scenario identification approaches, yet the scenarios can be used across hardware implementations.
    Preview · Conference Paper · Jan 2008
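In outline, clustering per-frame execution profiles into scenarios could look like the sketch below. The greedy assignment, the Manhattan distance, and the threshold are assumptions for illustration, not the authors' method:

```python
# Illustrative sketch: greedy scenario identification from per-frame
# execution profiles (e.g. Basic Block Vectors or Edge Vectors). A frame
# joins the first scenario whose centroid is within a distance threshold,
# otherwise it founds a new scenario.

def normalize(vec):
    """Scale a raw count vector so its entries sum to 1."""
    total = sum(vec) or 1
    return [x / total for x in vec]

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def identify_scenarios(profiles, threshold=0.2):
    """Return one scenario id per frame; centroids are running means."""
    centroids, counts, labels = [], [], []
    for p in map(normalize, profiles):
        for s, c in enumerate(centroids):
            if manhattan(p, c) <= threshold:
                counts[s] += 1
                n = counts[s]
                # Update the running mean of the matched scenario.
                centroids[s] = [(x * (n - 1) + y) / n for x, y in zip(c, p)]
                labels.append(s)
                break
        else:
            centroids.append(p)
            counts.append(1)
            labels.append(len(centroids) - 1)
    return labels
```

Because the profiles count code execution rather than cycles on a particular chip, the resulting scenario labels stay valid across hardware implementations, which is the bridging property the abstract claims.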