Conference Paper

Scalable to lossless audio compression based on perceptual set partitioning in hierarchical trees (PSPIHT)


Abstract

The paper proposes a technique for scalable to lossless audio compression. The scheme presented is perceptually scalable and also provides for lossless compression. It produces smooth objective scalability, in terms of SegSNR, from lossy to lossless compression. The proposal is built around the introduced perceptual SPIHT algorithm, which is a modification of the SPIHT algorithm. Both objective and subjective results are reported and demonstrate both perceptual and objective measure scalability. The subjective results indicate that the proposed method performs comparably with the MPEG-4 AAC coder at 16, 32 and 64 kbps, yet also achieves a scalable-to-lossless architecture.
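The abstract reports objective scalability in terms of SegSNR. As a rough illustration only, the sketch below computes a segmental SNR as the mean of per-frame SNR values in dB. The function name `segmental_snr`, the frame length, and the per-frame cap `max_db` (needed because a bit-exact, lossless frame has zero noise) are assumptions made for this example and are not taken from the paper.

```python
import math


def segmental_snr(reference, decoded, frame_len=256, max_db=100.0):
    """Mean per-frame SNR in dB over non-overlapping frames.

    frame_len and max_db are illustrative assumptions, not values from the paper.
    Frames with zero noise (bit-exact reconstruction) contribute max_db.
    """
    frame_snrs = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal = sum(r * r for r in ref)
        noise = sum((r - d) ** 2 for r, d in zip(ref, dec))
        if signal == 0:
            continue  # skip silent frames, whose SNR is undefined
        if noise == 0:
            frame_snrs.append(max_db)
        else:
            frame_snrs.append(min(max_db, 10.0 * math.log10(signal / noise)))
    if not frame_snrs:
        raise ValueError("no non-silent frame of the requested length")
    return sum(frame_snrs) / len(frame_snrs)
```

Under these assumptions, a coder that scales smoothly to lossless would show this value rising with bit rate until every frame reaches the cap.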


... The ability to smoothly scale from narrower bandwidth signals to wider bandwidth signals with different quantization resolution is also of interest, as pointed out in [8]. In this paper, we present an extension of the scalable audio coder presented in [9] to allow for the expansion of bandwidth as well as increase in quantization resolution. The scheme presented in [9] allows very fine granular scalability as well as competitive compression at the lossless stage across different bandwidths and quantization resolutions. The compression scheme is built around transform coding of audio, similar to [4], [6] and [5]. ...
... The scalable to lossless scheme presented in [9] is the basis upon which the bandwidth scalable coder is built; as such, it will be described first. Figure 1 illustrates the PSPIHT scalable to lossless scheme. ...
Conference Paper
This paper extends a scalable to lossless compression scheme to allow scalability in terms of sampling rate as well as quantization resolution. The scheme presented is an extension of a perceptually scalable scheme that scales to lossless compression, producing smooth objective scalability, in terms of SNR, until lossless compression is achieved. The scheme is built around the Perceptual SPIHT algorithm, which is a modification of the SPIHT algorithm. An analysis of the expected limitations of scaling across sampling rates is given as well as lossless compression results showing the competitive performance of the presented technique.
... Perceptual quality at the intermediate bit-rates is maintained by an iterative post processing that reshapes the spectral envelope of the reconstructed audio to resemble that of the original one. In [5] and [6], a frequency domain FGS coding approach is adopted. In [5], the perceptual set partitioning in hierarchical trees (PSPIHT) coding procedure, which is modified from the well-known SPIHT algorithm in image coding [8], is performed on the Modified Discrete Cosine Transform (MDCT) coefficients of the audio signal to generate the FGS bit-stream, and perceptual scalability is ensured by coding the MDCT coefficients in the order of their perceptual significance. Since the MDCT coefficients have been rounded to their nearest integers in PSPIHT, an additional residual coding layer is added to code the error generated due to this rounding operation in the time domain. ...
... Firstly, AAZ adopts the IntMDCT based transform lossless coding approach, which provides a simple and efficient framework to achieve a scalable to lossless audio coder, and helps to embed the MDCT based MPEG-4 AAC system in a rather straightforward manner. This also reduces the overhead of an additional residual coding layer [5] in the time domain which is generally inevitable if a noninteger transform is used. Secondly, to efficiently embed the AAC core, a novel error-mapping process is used so that the FGS coding is performed on the residual signal after removing the information that is already coded in the AAC core from the IntMDCT spectral data. ...
Article
This paper presents Advanced Audio Zip (AAZ), a fine grained scalable to lossless (SLS) audio coder that has recently been adopted as the reference model for MPEG-4 audio SLS work. AAZ integrates the functionalities of high-compression perceptual audio coding, fine granular scalable audio coding, and lossless audio coding in a single framework, and simultaneously provides backward compatibility to MPEG-4 Advanced Audio Coding (AAC). AAZ provides fine granular bit-rate scalability from lossy to lossless coding, and this scalability is achieved in a perceptually meaningful way, i.e., better perceptual quality at higher bit-rates. Despite its abundant functionalities, AAZ introduces only negligible overhead in terms of lossless compression performance compared with a nonscalable, lossless-only audio coder. As a result, AAZ provides a universal yet efficient solution for digital audio applications such as audio archiving, network audio streaming, portable audio playing, and music downloading, which were previously catered for by several different audio coding technologies, and eliminates the need for any transcoding system to facilitate sharing of digital audio contents across these application domains.
... Therefore, it is very desirable and attractive to construct a scalable codec with both fine scalable granularity and competitive efficiency. Recently, several works addressed the issue [2,3,4,5,6,7,8,9,10] by proposing fine-grain scalable audio compression schemes using the techniques of both ordered bitplane coding and tree-based significance mapping. The basic idea herein is to encode the transformed coefficients by frames. ...
... In particular, N = 4 was adopted in [2,5,6,7] for the MDCT transform, and N = 2 was used in [3,4] for the wavelet packet transform. These significance tree choices, in nature, are rather arbitrary. ...
Article
To address the fine-grain scalable audio compression issue, a novel combined significance tree technique is proposed for high compression efficiency. The core idea is to dynamically adopt a set of locally optimal significance trees, instead of following the common approach of using a single type of tree. Two different encoding strategies are proposed: the spectral coefficients can be encoded either in a threshold-by-threshold manner or in a segment-by-segment manner. The former yields rate and fidelity scalability, and the latter yields bandwidth scalability. Experimental results show that our proposed scheme significantly outperforms the existing schemes using single-type trees and performs comparably with the MPEG AAC coder while achieving fine-grain scalability.
... Most existing fine grain scalable audio coders (e.g. BSAC [5], PSPIHT [6], SCALA [7]) are transform-based: the signal is decomposed in an orthogonal basis of time-frequency functions (e.g. MDCT) and the resulting coefficients are quantized and coded. ...
... While the refinement pass is straightforward and is generally the same in all algorithms, the significance map encoding can be achieved in several different manners. In audio coding, several approaches have been tried: arithmetic coding based [5], tree based [6], runlength encoding based [7]. We use the approach of [7], based on an adaptive runlength encoding algorithm which appears to perform well on sparse significance maps. ...
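The split between a significance pass and a refinement pass described in the snippet above can be sketched as follows. This is a minimal illustration, assuming integer coefficients and a raw, uncompressed significance map (one bit per not-yet-significant coefficient per bitplane); it is not the arithmetic, tree, or runlength scheme of the cited coders. The names `bitplane_encode` and `bitplane_decode` are hypothetical.

```python
def bitplane_encode(coeffs):
    """Return (num_planes, bits) for integer coefficients, most significant plane first."""
    mags = [abs(c) for c in coeffs]
    num_planes = max(mags).bit_length() if any(mags) else 0
    significant = [False] * len(coeffs)
    bits = []
    for plane in range(num_planes - 1, -1, -1):
        newly = []
        # Significance pass: one bit per coefficient that is not yet significant,
        # followed by a sign bit when it becomes significant.
        for i, m in enumerate(mags):
            if significant[i]:
                continue
            bit = (m >> plane) & 1
            bits.append(bit)
            if bit:
                bits.append(1 if coeffs[i] < 0 else 0)
                newly.append(i)
        # Refinement pass: next magnitude bit of coefficients found in earlier planes.
        for i, m in enumerate(mags):
            if significant[i]:
                bits.append((m >> plane) & 1)
        for i in newly:
            significant[i] = True
    return num_planes, bits


def bitplane_decode(num_planes, bits, count):
    """Rebuild coefficients; a truncated bit list yields a coarser approximation."""
    mags = [0] * count
    negative = [False] * count
    significant = [False] * count
    stream = iter(bits)
    try:
        for plane in range(num_planes - 1, -1, -1):
            newly = []
            for i in range(count):
                if significant[i]:
                    continue
                if next(stream):
                    negative[i] = bool(next(stream))
                    mags[i] |= 1 << plane
                    newly.append(i)
            for i in range(count):
                if significant[i] and next(stream):
                    mags[i] |= 1 << plane
            for i in newly:
                significant[i] = True
    except StopIteration:
        pass  # stream was truncated: keep what has been decoded so far
    return [-m if neg else m for m, neg in zip(mags, negative)]
```

Decoding the full bit list returns the input exactly, while decoding any prefix returns magnitudes that are bit-prefixes of the true ones, which is the embedded property that fine-grain scalability relies on. The coders cited in the snippet differ mainly in how they compress the significance bits that this sketch emits raw.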
Conference Paper
Signal representations in overcomplete dictionaries are considered here as an alternative to the traditional transform representations for fine-grain scalable audio coding. Such representations produce sparser decompositions and thus allow better coding efficiency than transform coding at very low bitrates. Moreover, the decomposition algorithms are intrinsically progressive, and flexible enough to allow an efficient transient modeling. We propose in this paper a fine-grain scalable audio coder which works on a large range of bitrates (2 kb/s to 128 kb/s). Objective measures as well as informal subjective evaluation show that this coder outperforms a comparable transform-based coder at very low bitrates.
... with […] the dimension of the sample space and […] the number of cells for quantizer […]. The output of the quantizer is a fixed-length binary code of […] bits, with […] (2). The quantizers are said to be embedded if there exist fixed-length binary codes […] such that […] (3). In other words, the quantizers are embedded if they produce a sequence of embedded binary codes: higher rate codes contain lower rate codes plus bits of refinement. Embedded quantizers find useful applications in progressive transmission of information. ...
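The embedding property stated in the snippet (higher rate codes contain lower rate codes plus refinement bits) can be illustrated with a much simpler quantizer than the polar quantizers of the cited article. The sketch below assumes a uniform scalar quantizer on the interval [0, 1); the function names `quantize` and `reconstruct` are hypothetical, and the symbols lost from the snippet are not reconstructed here.

```python
def quantize(x, bits):
    """Index of the uniform cell containing x in [0, 1) at the given resolution."""
    if not 0.0 <= x < 1.0:
        raise ValueError("x must lie in [0, 1)")
    return int(x * (1 << bits))


def reconstruct(index, bits):
    """Midpoint of the cell identified by index at the given resolution."""
    return (index + 0.5) / (1 << bits)


# The b-bit code is the b most significant bits of any finer code, so a
# receiver can stop reading after b bits and still decode a valid index.
x = 0.37
fine = quantize(x, 8)
coarse = quantize(x, 3)
assert coarse == fine >> 5
```

Each extra bit halves the cell width, so under these assumptions the reconstruction error is bounded by half the cell width at every rate.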
... In the case of embedded coding, a bitstream at maximum resolution is stored; users with less bandwidth just need to stop the reception at any time, yet they can still receive the file with the best quality given their bandwidth. Recent work [3] focused on applying the SPIHT algorithm to MDCT-based audio coding. Transform coding gives excellent results at high rates, but at low rates, the quality decreases significantly, and thus, parametric coding performs better. ...
Article
Embedded polar quantization can be useful for progressive transmission of circularly symmetric data, e.g., for fine-grain scalable coding of parametric audio. Sets of constrained-resolution embedded quantizers are built recursively by successive refinement processes, which are detailed for strict polar quantization and unrestricted polar quantization. The quadratic error minimization problem is solved using equations similar to those of Max, and the refinement algorithm can, in the unrestricted case, be simplified using a high-rate approximation. For Gaussian data, comparisons with reference non-embedded quantizers show that the embedding property comes at an often negligible cost in terms of rate-distortion performance.
... Image coders such as EZW [Sha93] and SPIHT [SP96] use a bitplane encoding algorithm with a significance pass that uses the tree structure inherent in the wavelet transform. Though this tree structure is less evident for an MDCT transform, bitplane encoding algorithms similar to SPIHT have been applied to MDCT-based audio coding [RMB02,RMB03,SZM05]. Another way to perform the significance pass is to use arithmetic coding to code the bits in BS; this is the technique chosen by the MPEG-4 standards BSAC [PKS97] and SLS [YRXK06]. ...
Article
This thesis investigates new signal representations for audio coding. Existing state-of-the-art audio coders are based either on a transform (transform coding), or on a parametric model (parametric coding), or on a combination of both (hybrid coding). On the one hand, transform coding achieves (near-)transparent quality at high bitrates (e.g. AAC at 64 kbps/channel), but gives poor performance at lower bitrates. On the other hand, parametric and hybrid coding achieve better performance than transform coding at low bitrates but cannot give transparent quality at high bitrates. The new approach for signal representation that we propose makes it possible to achieve transparent quality at high bitrates, while giving better performance than transform coding at low bitrates. This signal representation is based on an overcomplete set of time-frequency functions composed of a union of several MDCT bases with different scales. The first major contribution of this thesis is a fast and efficient algorithm that decomposes a signal into this overcomplete set of functions. The second major contribution of this thesis is a set of techniques that allows the coding of these representations in an efficient and scalable way. Finally, this thesis investigates the application to audio indexing. We show that using a union of several MDCT bases makes it possible to go beyond the limitations of the representations used in the transform coders (particularly the frequency resolution), which enables efficient indexing in the transform domain.
... Direct comparisons of the average number of significant coefficients nsig coded across the bitrate range 32-96 kb/s (sampled in steps of 8 kb/s) indicate that SPIHT suffers an equivalent bitrate penalty of 9% relative to AAC (Table 1). This result is consistent with previously reported subjective comparisons between AAC and SPIHT [16]. ...
Article
Low-complexity audio compression offering fine-grain bitrate scalability can be realised with bitplane runlength coding. Adaptive Golomb codes are computationally simple runlength codes that allow bitplane runlength coding to achieve notable coding efficiency. For multi-block audio frames, coefficient interleaving prior to bitplane runlength coding results in a substantial increase in coding efficiency. It is shown that bitplane runlength coding is more compact than the best known SPIHT arrangement for audio coding, and achieves coding efficiency that is competitive with fixed-rate quantisation.
... 1b with the fixed parent-children relationship O(i) = iN + {0, 1, · · · , N − 1} for different positive integers N . For the MDCT transform, N = 4 was adopted in [9, 10, 11, 12] and the wavelet packet transform was encoded using N = 2 in [13, 14]. This type of tree will be referenced in the following as SPIHT-style significance trees. ...
Article
A fine-grain scalable and efficient compression scheme for sparse data based on adaptive significance-trees is presented. Common approaches for 2-D image compression like EZW (embedded wavelet zero tree) and SPIHT (set partitioning in hierarchical trees) use a fixed significance-tree that captures well the inter- and intraband correlations of wavelet coefficients. For most 1-D signals like audio, such rigid coefficient correlations are not present. We address this problem by dynamically selecting an optimal significance-tree for the actual data frame from a given set of possible trees. Experimental results on sparse representations of audio signals are given, showing that this coding scheme outperforms single-type tree coding schemes and performs comparably to the MPEG AAC coder while additionally achieving fine-grain scalability.
... overcome this drawback, several approaches for a certain bitrate scalability with varying quality have been presented (e.g. [1, 2, 3, 4, 5, 6]). For low bitrates, say, below 64 kbps, the perceived quality for both, scalable and fixed target bitrate coders, decreases markedly. ...
Conference Paper
We present strategies for perceptual improvements of embedded audio coding based on psychoacoustic weighting and spectral envelope restoration. The encoding schemes exhibit fine-grain bitrate scalability via the set partitioning in hierarchical trees (SPIHT) algorithm. Weighting factors and envelope parameters are transmitted under careful consideration of the amount of side information. For low bitrates, where the number of actually transmitted waveform coefficients is low, missing coefficients are shaped w.r.t. the spectral envelope. In our approach, the envelope information is transmitted in form of band-wise values of the l1-norm. Sets of standardized audio files as well as various audio data of contemporary music are encoded and the results are analyzed with objective measures of perceptual quality. The proposed coding scheme competes in perceptual quality with existing state-of-the-art fixed bitrate coders such as MPEG-2/4 AAC. For low bitrates, the proposed embedded coding envelope restoration (ECER) improves the perceptual audio quality notably.
... The implementation of SPIHT would be much cheaper and thus suitable for still image compression appliances. Moreover, the SPIHT based encoding algorithm is also applied to SOT based audio compression [15]–[17]. There are still some interesting issues concerning SPIHT based algorithms and applications. ...
Article
Set Partitioning in Hierarchical Trees (SPIHT) is a highly efficient technique for compressing Discrete Wavelet Transform (DWT) decomposed images. Though its compression efficiency is a little less famous than Embedded Block Coding with Optimized Truncation (EBCOT) adopted by JPEG2000, SPIHT has a straightforward coding procedure and requires no tables. These make SPIHT a more appropriate algorithm for lower cost hardware implementation. In this paper, a modified SPIHT algorithm is presented. The modifications include a simplification of the coefficient scanning process, a 1-D addressing method instead of the original 2-D arrangement of wavelet coefficients, and a fixed memory allocation for the data lists instead of the dynamic allocation approach required in the original SPIHT. Although the distortion is slightly increased, it facilitates an extremely fast throughput and easier hardware implementation. The VLSI implementation demonstrates that the proposed design can encode a CIF (352 × 288) 4:2:0 image sequence with at least 30 frames per second at 100-MHz working frequency.
... Most existing algorithms use a single type of tree as shown in Figure 1b with the fixed parent-children relationship O(i) = iN + {0, 1, · · · , N − 1} for different positive integers N. For the MDCT transform, N = 4 was adopted in [6,7,8,9] and the wavelet packet transform was encoded using N = 2 in [10,11]. This type of tree will be referenced in the following as SPIHT-style significance trees. ...
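The fixed parent-children relationship O(i) = iN + {0, 1, ..., N − 1} quoted above can be sketched as follows. The function names `offspring` and `descendants` are hypothetical. Excluding a node from its own offspring (which the formula would otherwise produce for i = 0) and clipping at the frame length are assumptions made for this sketch, not details taken from the cited papers.

```python
def offspring(i, n, num_coeffs):
    """Direct children of coefficient i in an N-ary SPIHT-style tree.

    Follows O(i) = i*n + {0, ..., n-1}, dropping indices past the end of the
    frame and the self-reference that the formula yields for i = 0 (assumption).
    """
    return [c for c in range(i * n, i * n + n) if c < num_coeffs and c != i]


def descendants(i, n, num_coeffs):
    """All coefficients below i, gathered level by level."""
    found = []
    frontier = offspring(i, n, num_coeffs)
    while frontier:
        found.extend(frontier)
        frontier = [c for p in frontier for c in offspring(p, n, num_coeffs)]
    return found


print(offspring(1, 4, 64))  # children of coefficient 1 for N = 4
```

With N = 4, as the snippet reports for the MDCT, coefficient 1 has children 4 to 7, and every coefficient of a 16-coefficient frame except the root is reached from index 0.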
Conference Paper
A fine-grain scalable and efficient audio compression scheme based on adaptive significance-trees is presented. Common approaches for 2-D image compression like EZW (embedded wavelet zero tree) and SPIHT (set partitioning in hierarchical trees) use a fixed significance-tree that captures well the inter- and intraband correlations of wavelet coefficients. For 1-D audio signals, such rigid coefficient correlations are not present. We address this problem by dynamically selecting an optimal significance-tree for the actual audio frame from a given set of possible trees. Experimental results are given, showing that this coding scheme outperforms single-type tree coding schemes and performs comparably to the MPEG AAC coder while additionally achieving fine-grain scalability.
Article
File sharing networks are one of the most popular Internet applications nowadays. Despite its success, these networks have two main issues: the ineffectiveness in downloading content that few users in the network have and the high incidence of getting files different from the ones the user supposed to ask for. This paper presents new mechanisms for indexing and segmentation of audio content, aiming to improve the user experience when facing the problems previously mentioned. Our proposal has been validated through tests, with good results in comparison with the approach used by popular file sharing applications.
Article
Set partitioning in hierarchical trees (SPIHT) is known for its relatively simple implementation and flexible scalability when it is combined with the discrete wavelet transform (DWT). The authors propose a method called combined significance map coding (CSMC) to improve the coding efficiency of SPIHT when used with the block-based discrete cosine transform (DCT). CSMC groups some blocks and encodes the combined significance map of one to several blocks together. Many of the bits spent in significance map coding can be saved when the trees constructed with block DCT coefficients have similar locality. In our simulation results, CSMC improves significantly on the original SPIHT coder using DWT and DCT. It also yields better performance than JPEG2000, and even outperforms the non-scalable H.264 intra-mode coder for some test images. No coding table is required, and the fine rate/quality scalability property of SPIHT is still preserved.
Article
This paper studies the fine-grain scalable compression problem with emphasis on 1-D signals such as audio signals. As in the successful 2-D still image compression techniques embedded zerotree wavelet coder (EZW) and set partitioning in hierarchical trees (SPIHT), the desired fine-granular scalability and high coding efficiency derive from a tree-based significance mapping technique. A significance tree serves to quickly locate and efficiently encode the important coefficients in the transform domain. The aim of this paper is to find such suitable significance trees for compressing dynamically variant 1-D signals. The proposed solution is a novel dynamic significance tree (DST) where, unlike in existing solutions with a single type of tree, a significance tree is chosen dynamically out of a set of trees by taking into account the actual coefficient distribution. We show how a set of possible DSTs can be derived that is optimized for a given (training) dataset. The method outperforms the existing scheme for lossy audio compression based on a single-type tree (SPIHT) and the scalable audio coding schemes MPEG-4 BSAC and MPEG-4 SLS. For bitrates less than 32 kbps, it results in an improved perceived audio quality compared to the fixed-bitrate MPEG-2/4 AAC audio coding scheme while providing progressive transmission and finer scalability.
Conference Paper
In SOT based coding methods, ordering of the coefficients could affect coding efficiency. A small coefficient placed close to tree root usually reduces the coding efficiency significantly. Hence, a good reordering scheme for a SOT is critical. In digital audio coding, information of neighboring frames is usually highly correlated. Based on this property, a SOT based audio coding scheme with a coefficient reordering method is proposed. The proposed coder may have the coefficients of the current frame reordered depending on the decoded results of some previous frames. The coding qualities of the proposed coder and MPEG-4 AAC coder are compared using ODG (Objective Differential Grade) on a variety of sound sources.
Article
A perceptually enhanced prioritized bit-plane audio coding algorithm is presented in this paper. According to the energy distribution in different frequency regions, the bit-planes are prioritized with optimized parameters. Based on the statistical modeling of the frequency spectrum, a much more simplified implementation of prioritized bit-plane coding is integrated with the recent release of MPEG-4 scalable lossless (SLS) audio coding structure by replacing the sequential bit-plane coding in the enhancement layer. With zero extra side information, trivial added complexity, and modification to the original SLS structure, extensive experimental results show that the perceptual quality of SLS with noncore and very low core bit-rate is improved significantly in a wide range of bit-rate combinations. Fully scalable audio coding up to lossless with much enhanced perceptual quality is thus achieved. Index Terms—Bit-plane coding, scalable audio coding (SAC).
Article
The paper addresses a bitstream scalable coder based on the MPEG-4 scalable lossless (SLS) coding system where, in contrast to SLS, the bitrate of the enhancement layer is not fixed but instead an attempt is made to create a quality-fixed enhancement layer. With a PCM audio input, the proposed structure is able to produce an audio version with near-transparent quality on top of the existing low-quality version. In particular, the proposed fixed quality enhancing process with checking procedures is able to provide the minimum amount of enhancement for the low-quality version to obtain a near-transparent quality that is almost indistinguishable from the CD quality. In addition, a bitrate estimation model is proposed. The model enables the direct estimation of the enhancing bitrate from two parameters extracted from the encoding process of the low-quality version. Evaluation results indicate that a better defined quality level is guaranteed compared to a fixed bitrate setting and that in the mean a lower (approximately 20%) bitrate is attained. It is also shown that the estimation model proposed is able to accurately predict the necessary enhancing bitrate and at the same time, reduce the complexity by around 17%.
Article
This thesis studies the techniques and feasibility of embedding a perceptual audio coder within a lossless compression scheme. The goal is to provide for two step scalability in the resulting bitstream, where both a perceptual version of the audio signal and a lossless version of the same signal are provided in the one bitstream. The focus of this thesis is the selection of the perceptual coder to be used as the perceptual base layer and the techniques to be used to compress the lossless layer by using backward linear prediction followed by entropy coding. The perceptual base layer used is MPEG-4 AAC, chosen based on entropy measurements of the residual signal. Results of the work in this thesis show that the embedded lossless coding scheme could achieve an average compression ratio of only 6% larger compared to lossless only coding. Performing decorrelation on the AAC residual signal by means of backward linear predictive coding and measuring the entropy of the resulting LPC residual signal of various orders revealed that an 8% decrease in coding rate is achievable using 15th order prediction. Furthermore, this thesis also investigates an entropy coding technique known as cascade coding which is originally designed to compress hydroacoustic image data and is modified to compress audio data. Cascade coding is an entropy coding technique that uses multiple cascaded stages where each stage codes a specific range of integers and is used to perform entropy coding of the backward linear prediction residual signal. The cascade coding technique explored in this thesis includes using a frame based approach and trained codebooks.
Conference Paper
This paper presents a scalable to lossless compression scheme that allows scalability in terms of sampling rate as well as quantization resolution. The scheme presented is perceptually scalable and it also allows lossless compression. The scheme produces smooth objective scalability, in terms of SNR, until lossless compression is achieved. The scheme is built around the perceptual SPIHT algorithm, which is a modification of the SPIHT algorithm. Objective and subjective results are given that show perceptual as well as objective scalability. The subjective results given also show that the proposed scheme performs comparably with the MPEG-4 AAC coder at 16, 32 and 64 kbps.
Article
In this paper, the compression of multispectral images is addressed. Such 3-D data are characterized by a high correlation across the spectral components. The efficiency of the state-of-the-art wavelet-based coder 3-D SPIHT is considered. Although the 3-D SPIHT algorithm provides the obvious way to process a multispectral image as a volumetric block and, consequently, maintain the attractive properties exhibited in 2-D (excellent performance, low complexity, and embeddedness of the bit-stream), its 3-D tree structure is shown to be poorly suited to 3-D wavelet transformed (DWT) multispectral images. The fact that each parent has eight children in the 3-D structure considerably increases the list of insignificant sets (LIS) and the list of insignificant pixels (LIP), since the partitioning of any set produces eight subsets which will be processed similarly during the sorting pass. Thus, a significant portion of the overall bit budget is wasted on sorting insignificant information. Through an investigation based on results analysis, we demonstrate that a straightforward 2-D SPIHT technique, when suitably adjusted to maintain the rate scalability and carried out in the 3-D DWT domain, overcomes this weakness. In addition, a new SPIHT-based scalable multispectral image compression algorithm is used in the initial iterations to exploit the redundancies within each group of two consecutive spectral bands. Numerical experiments on a number of multispectral images have shown that the proposed scheme provides significant improvements over related works.
Article
Recent papers have proposed linear prediction as a useful method for lossless audio coding. Transform coding, however, has hardly been investigated, although it seems to be more suited for the harmonic structure of most audio signals. In this paper we present some results on lossless transform coding of CD-quality audio data. One main aspect lies on a convenient quantization method to guarantee perfect reconstruction. We achieve bit rates which are lower than those obtained by lossless linear prediction schemes.
Article
This paper discusses the application of the Set Partitioning In Hierarchical Trees (SPIHT) algorithm to the compression of audio signals. Simultaneous masking is used to reduce the number of coefficients required for the representation of the audio signal. The proposed scheme is based on the combination of the Modulated Lapped Transform (MLT) and SPIHT. Comparisons are also made with the Discrete Wavelet Transform (DWT) based scheme. Results presented reveal the compression achieved as well as the scalability of the proposed coding scheme. The MLT based scheme is shown to have compression performance that is superior to the DWT based scheme.
Article
Embedded zerotree wavelet (EZW) coding, introduced by J. M. Shapiro, is a very effective and computationally simple technique for image compression. Here we offer an alternative explanation of the principles of its operation, so that the reasons for its excellent performance can be better understood. These principles are partial ordering by magnitude with a set partitioning sorting algorithm, ordered bit plane transmission, and exploitation of self-similarity across different scales of an image wavelet transform. Moreover, we present a new and different implementation, based on set partitioning in hierarchical trees (SPIHT), which provides even better performance than our previously reported extension of the EZW that surpassed the performance of the original EZW. The image coding results, calculated from actual file sizes and images reconstructed by the decoding algorithm, are either comparable to or surpass previous results obtained through much more sophisticated and computationally complex methods. In addition, the new coding and decoding procedures are extremely fast, and they can be made even faster, with only small loss in performance, by omitting entropy coding of the bit stream by arithmetic code.
Conference Paper
The perceptual entropy of each short-term section of the audio stimuli is estimated as the number of bits required to encode the short-term spectrum of the signal to the resolution measured by this process provide an entropy estimate, for transparent coding, of 1.4 (mean) or 2.1 (peak) bits/sample for telephone speech (200-3200-Hz bandwidth sampled at 8 kHz). The entropy measures for audio signals of other bandwidths and sampling rates are also reported.
Article
Lossless audio compression is likely to play an important part in music distribution over the Internet, DVD audio, digital audio archiving, and mixing. The article is a survey and a classification of the current state-of-the-art lossless audio compression algorithms. This study finds that lossless audio coders have reached a limit in what can be achieved for lossless compression of audio. It also describes a new lossless audio coder called AudioPak, which has low algorithmic complexity and performs as well as or even better than most of the lossless audio coders that have been described in the literature.
Mohammed Raad is a recipient of an Australian Postgraduate Award of Industry (APAI) grant. This work is supported by the Motorola Australian Research Centre. The authors wish to thank Ms. Melanie Jackson for her help in conducting the listening tests.

7. REFERENCES

[1] M. Hans and R. W. Schafer, "Lossless compression of digital audio," IEEE Signal Processing Magazine, vol. 18, no. 4, pp. 21-32, July 2001.