Article

Modulated filter banks with arbitrary system delay: Efficient implementations and the time-varying case


Abstract

We present a new method for the design and implementation of modulated filter banks with perfect reconstruction. It is based on the decomposition of the analysis and synthesis polyphase matrices into a product of two different types of simple matrices, which replaces the polyphase filtering part of a modulated filter bank. Special consideration is given to cosine-modulated as well as time-varying filter banks. The new structure provides several advantages. First, it allows easy control of the input-output system delay, which can be chosen in single steps of the input sampling rate, independent of the filter length. This property can be used in audio coding applications to reduce pre-echoes. Second, it results in a structure that is nearly twice as efficient as performing the polyphase filtering directly. Perfect reconstruction is a structurally inherent feature of the new formulation, even for nonlinear operations or time-varying coefficients. Hence, the structure is especially suited to the design of time-varying filter banks in which both the number of bands and the prototype filters can be changed while maintaining perfect reconstruction and critical sampling. Further, a proof of effective completeness is given, and the design of analysis and synthesis filter banks with equal magnitude responses is described. Filter design can be performed by unconstrained optimization of the matrix coefficients according to a given cost function. Design and audio-coding application examples are given to show the performance of the new filter bank.


... We describe the filter bank as a polyphase filter matrix. A maximally-decimated filter bank can be described by the so-called polyphase description [8]. The main advantage of the polyphase description is its mathematical compactness, so that a filter bank is fully described by a polyphase filter matrix. ...
... S(z) is a shift matrix that advances a block or vector by one entry (see [8]). ...
... In order to express this time dependency, the parameter m, denoting the time instance, is introduced. Thus G(z) becomes G(z, m) (see [8]), and the time signal X(z) is now obtained using the formula X ...
Conference Paper
Full-text available
We describe an efficient conversion method, which directly converts a desired spectral representation from compressed audio material. The conversion method provides a feature extraction algorithm with a suitable complex frequency representation of an audio signal. The presented conversion allows us to trade off computational complexity against accuracy. We then test several operating points with an MPEG audio feature extraction system. In general, this leads to a reduction of the computational complexity from O(N log N) to O(N), compared to the conventional method of first decoding and then applying the DFT to the resulting time domain audio signal.
... In [20] and [21], the PR conditions were derived and biorthogonal CMFBs were designed numerically by using a quadratic-constrained optimization procedure [15]. In comparison, the PR property can be structurally guaranteed in the factorization-based approach [22]. The connection between the methods of [21] and [22] was established in [23]. ...
... In comparison, the PR property can be structurally guaranteed in the factorization-based approach [22]. The connection between the methods of [21] and [22] was established in [23]. More recently, two second-order cone-programming based algorithms were proposed in [27] for designing nearly PR and practically PR CMFBs, which can make the filter banks extremely close to the PR property. ...
... Setting the gradient vector of the objective function to be zero, we have (20) By invoking the matrix inversion lemma [35], (20) can be rewritten as (21) where (22) ...
Article
Full-text available
This paper presents several new properties of biorthogonal cosine modulated filter banks (CMFBs) and efficient algorithms for designing CMFBs with a very large number of subbands and very long filters. For a biorthogonal CMFB, we find the periodicity and symmetry of its overall transfer function and aliasing transfer functions which can be efficiently computed based on a decimated uniform discrete Fourier transform (DFT) analysis filter bank. By exploiting gradient information and 2M-th band conditions, efficient algorithms are proposed for designing both orthogonal and biorthogonal CMFBs. In addition, an efficient matrix inversion algorithm with O(N²) complexity is also presented. Several numerical examples and comparisons with many other existing methods are included to demonstrate the design performance and efficiency of the algorithms.
... However, these codecs usually operate on long frame lengths for good frequency selectivity. They also typically use orthogonal filterbanks such as Modified Discrete Cosine Transform (MDCT) [5], due to which, the delay contributed by the filterbank depends on the length of the prototype low-pass filter [8]. Hence, they are usually characterized by high algorithmic delays, making them unsuitable for full-duplex communication. ...
... While the SBR technology [23][24] improves coding efficiency, LD-SBR tool also minimizes the introduced delay by avoiding the use of variable time grid [6,22] and by using low-delay analysis and synthesis quadrature mirror filter (QMF) banks [22]. The delay of the new core coder filterbank is independent of filter length [6,8] and hence, a window with multiple overlap for good frequency selectivity can be used. Parts of the window that access future input values are zeroed out, thus reducing the delay further. ...
... The MDCT and IMDCT are defined as follows [5,8]: ...
Conference Paper
Full-text available
Recently MPEG has developed a new audio coding standard - MPEG-4 AAC enhanced low delay (ELD), targeting low bit rate, full-duplex communication applications such as audio and video conferencing. The AAC-ELD combines low delay SBR filterbank with a low delay core coder filterbank to achieve both high coding efficiency and low algorithmic delay. In this paper, we propose an efficient mapping of the AAC-ELD core coder filterbank to the well known MDCT. This provides a fast algorithm for the new filterbank. Since AAC-LD and AAC-LC profiles also use MDCT filterbank, this mapping enables efficient joint implementation of filterbanks for all 3 profiles. We also present a very efficient 15-point DCT-II algorithm that is useful for implementation of all 3 profiles with frame lengths of 960 and 480. This algorithm requires just 17 multiplications and 67 additions. The overall design structure and complexity analysis of proposed implementation of the filterbanks is also provided.
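For orientation, the MDCT referred to above can be illustrated with a minimal sketch of a windowed MDCT/IMDCT pair with overlap-add. This is not the fast algorithm or the AAC-ELD mapping proposed in the paper: it is a direct O(N²) evaluation of the standard textbook definition, and the sine window, the 2/N scaling, the even block size N, and all function names are assumptions chosen for illustration.

```python
import math

def sine_window(N):
    # Length-2N sine window; satisfies w[n]^2 + w[n+N]^2 = 1 (Princen-Bradley).
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def mdct(block, w):
    # Windowed forward MDCT: 2N input samples -> N coefficients.
    N = len(block) // 2
    return [sum(w[n] * block[n] *
                math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]

def imdct(coeffs, w):
    # Windowed inverse MDCT: N coefficients -> 2N time-aliased samples.
    N = len(coeffs)
    return [w[n] * (2.0 / N) *
            sum(coeffs[k] *
                math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for k in range(N))
            for n in range(2 * N)]

def analysis_synthesis(x, N):
    # Assumes N is even and len(x) is a multiple of N.
    w = sine_window(N)
    padded = [0.0] * N + list(x) + [0.0] * N
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        y = imdct(mdct(padded[start:start + 2 * N], w), w)
        for n in range(2 * N):
            out[start + n] += y[n]  # overlap-add cancels the time-domain aliasing
    return out[N:N + len(x)]
```

Each block of 2N samples yields only N coefficients (critical sampling), and the aliasing introduced in one block is cancelled by its two neighbours during overlap-add, so the input is recovered up to floating-point error.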
... The design method used here was first described in [14,15], and later in [16,17] ...
... The low-delay filterbank can be implemented as efficiently as a regular MDCT (see also [17]) using the general structure as illustrated in Figure 10. and thus do not involve any operation. For the overlap extended by 2 · M into the past, only M additional multiply-add operations are required. ...
... For the overlap extended by 2 · M into the past, only M additional multiply-add operations are required. In [17] these additional operations are referred to as "zero-delay matrices". In publications in the area of integer filterbanks, these operations are also known as "lifting steps" [21]. ...
Conference Paper
Full-text available
The MPEG-4 Low Delay Advanced Audio Coding (AAC-LD) scheme has recently evolved into a popular algorithm for audio communication. It produces excellent audio quality at bitrates between 64 kbit/s and 48 kbit/s per channel. This paper introduces an enhancement to AAC-LD which reduces the bitrate demand by 25-33%. This is achieved both by adding a delay-optimized version of the Spectral Band Replication (SBR) tool and by utilizing a dedicated low delay filterbank. The introduced techniques maintain the high audio quality and offer an algorithmic delay low enough for use in two-way communication systems. This paper describes the coder enhancements, including a detailed discussion of algorithmic delay issues, a performance assessment, and possible applications.
... These filter banks can be designed with a decomposition of their structure into a DCT and a cascade of preprocessing steps. In a polyphase representation these preprocessing steps show up as maximum-delay and zero-delay matrices and a diagonal factor matrix [10], see Figure 3. The maximum-delay and zero-delay matrices can be seen as a set of lifting steps. ...
... With these basic matrices, the following product or decomposition is a general form for F_a(z) [10], which includes low delay filter banks and conventional filter banks like the Modified Discrete Cosine Transform (MDCT), also known as TDAC filter bank or lapped orthogonal transform. ...
Conference Paper
Full-text available
Recently, lifting-based integer approximations of filter banks have received much attention, especially in the field of image coding. This paper focuses on the application of these techniques to cosine modulated filter banks for audio coding, including not only the modified discrete cosine transform (MDCT) but also low delay filter banks. Applications of the integer filter banks include lossless audio coding and backward compatible lossless enhancement of MDCT-based perceptual audio coding schemes, such as MPEG-2/4 AAC.
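The reason lifting steps keep perfect reconstruction even with rounding can be shown with a small sketch. It is a generic two-channel illustration, not the filter bank structure of any paper listed here; the function names and the coefficients `a` and `b` are hypothetical.

```python
def lift_forward(x0, x1, a, b):
    # Each step adds a rounded function of one channel to the other channel.
    y1 = x1 + round(a * x0)
    y0 = x0 + round(b * y1)
    return y0, y1

def lift_inverse(y0, y1, a, b):
    # Undo the steps in reverse order by subtracting the same rounded values.
    x0 = y0 - round(b * y1)
    x1 = y1 - round(a * x0)
    return x0, x1

# Usage: integer input is mapped to integer output and recovered exactly.
pair = lift_forward(5, -3, 0.37, -1.62)
restored = lift_inverse(pair[0], pair[1], 0.37, -1.62)
```

Because the inverse recomputes exactly the quantity that the forward step added, the rounding (or any other nonlinear operation applied to the untouched channel) cancels, which is why such structures map integers to integers without loss.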
... In the appendix we show that all parts of the upper equalities hold true if the prototypes satisfy the PR constraints given in (10) and (11), which connects the PR constraints on the polyphase filters as given in [1] with the coefficients of the maximum-delay matrices introduced in [14]. ...
... We show in the appendix that all parts of the upper two equations are fulfilled for PR prototypes satisfying the constraints from [1] given in (10) and (11). Thus, (47) establishes the relationship between the direct formulation of the PR constraints on the polyphase filters from [1] and the coefficients of the zero-delay matrices from [14]. Note that we only get a causal solution for ...
Article
Biorthogonal modulated filter banks, when compared to paraunitary ones, provide the advantage that the overall system delay can be chosen independently of the filter length, thus allowing the design of low delay filter banks. They have recently been studied by several authors. In this paper, we connect two different design methods, namely the quadratic constrained least-squares optimization and the principle of cascading sparse self-inverse matrices. Moreover, we show how factorizations into zero-delay and maximum-delay matrices can be utilized in order to achieve desirable features such as structure-inherent perfect reconstruction, no DC leakage of the filter bank, and a low implementation cost.
... Heller et al. [22] and Schuller et al. [23] proposed two different approaches for designing perfect reconstruction cosine-modulated filter banks with arbitrary delay. The authors of these two papers derived the necessary and sufficient conditions for low-delay perfect reconstruction biorthogonal cosine-modulated filter banks (different prototype filters for analysis and synthesis filter banks) using polyphase structure. ...
... Accordingly, the prototype filter is designed to minimize a weighted sum of the stopband energy and the difference between the ideal low-pass filter and the designed filter in the pass-band subject to the constraints in (22). This leads to a filter design procedure that aims at minimizing the objective function (23) subject to the constraints in (22). In (23), is a tradeoff parameter. ...
Article
Full-text available
This paper presents a method for designing low-delay nonuniform pseudo quadrature mirror filter (QMF) banks. This method is motivated by the work of Li, Nguyen, and Tantaratana, in which the nonuniform filter bank is realized by combining an appropriate number of adjacent sub-bands of a uniform pseudo-QMF bank. In prior work, the prototype filter of the uniform pseudo-QMF bank was constrained to have linear phase and the overall delay associated with the filter bank was often unacceptably large for filter banks with a large number of sub-bands. This paper proposes a pseudo-QMF filter bank design technique that significantly reduces the delay by relaxing the linear phase constraints. An example in which an oversampled critical-band nonuniform filter bank is designed and applied to a two-state modeling speech enhancement system is presented in this paper. Comparison of the performance of this system to competing methods employing tree-structured, linear phase multiresolution analysis indicates that the approach described in this paper strikes a good balance between system performance and low delay
... type-IV discrete cosine transform (DCT-IV) generally for cosine-modulated filter banks with arbitrary system delay are indicated in [12][13][14], and similarly for the LD-MDCT filter banks in [8], they have not been further investigated in detail. The AAC-ELD defines the window length to be 2N with N = 1024 or N = 960 [1]. ...
... The symmetry properties of data sequences {c_k^TDAC} and {x_n^TDAC} as well as the general mathematical and special properties of the TDAC-MDCT are presented in [25]. It can be easily seen that the cosine transform kernels of the LD-MDCT in (9) and (12), and those of the TDAC-MDCT in (13) and (14), differ only by the shift factor ∓N/2. ...
Article
Recently, the MPEG committee has completed the development of a new audio coding standard, the MPEG-4 Advanced Audio Coding-Enhanced Low Delay (AAC-ELD). The state-of-the-art MPEG audio coding standards, such as MPEG-4 AAC Low Complexity (AAC-LC), High Efficiency AAC (HE-AAC) and AAC Low Delay (AAC-LD), utilize for the time-to-frequency transformation of an audio block and vice versa, the well known time domain aliasing cancellation modified discrete cosine transform (TDAC-MDCT). In order to achieve low algorithmic delay, the AAC-ELD has adopted a perfect reconstruction low delay filter bank, called the low delay MDCT (LD-MDCT). New fast algorithms for the LD-MDCT computation in the AAC-ELD are proposed. Further, a simple relation between the LD-MDCT and the TDAC-MDCT is derived. Exploiting this relation, as an alternative, the improved fast LD-MDCT algorithms based on the TDAC-MDCT are presented. They do not require reverse operations and sign changes with respect to both the time and frequency indices compared to existing TDAC-MDCT-based fast algorithms. Since the AAC-LC, HE-AAC and AAC-LD audio codecs use the TDAC-MDCT, the new proposed and improved TDAC-MDCT-based fast LD-MDCT algorithms provide the unified efficient implementation of LD-MDCT and TDAC-MDCT filter banks in all four codecs: AAC-ELD, AAC-LD, HE-AAC and AAC-LC.
... This still causes a comparatively high signal delay which depends, among other things, on the permitted aliasing distortions. Many designs of uniform and non-uniform filter-banks allow an (almost) arbitrary signal delay to be prescribed, e.g., [16, 18, 40, 69]. However, it is problematic to achieve simultaneously a high stopband attenuation for the subband filters as well as a low signal delay. ...
Chapter
Full-text available
Digital filter-banks are an integral part of many speech and audio processing algorithms used in today’s communication systems. They are commonly employed for adaptive subband filtering, for example, to perform acoustic echo cancellation in hands-free communication devices or multi-channel dynamic-range compression in digital hearing aids, e.g., [34,81]. Another frequent task is speech enhancement by noise reduction, e.g., [4,81]. This eases the communication in adverse environments where acoustic background noise impairs the intelligibility and fidelity of the transmitted speech signal. A noise reduction system is also beneficial to improve the performance of speech coding and speech recognition systems, e.g., [41].
... The goal now is to show that the cosine-modulation matrix of an FB with filter order can be implemented as (34) with given by (6h). 9 A similar proof for a different type of cosine-modulation functions than the one used in this paper can be found in [18]. 10 Similar relations can be derived for N = 2K M 0 1, with K being an odd integer. ...
Article
Full-text available
The analysis and synthesis parts of a cosine-modulated M-channel filterbank (FB) contain two sections, a modulation block and a prototype filter implemented in a polyphase structure. Although, in many cases, a linear-phase prototype filter is used, the coefficient symmetry of this filter is not utilized when using the existing polyphase structure. In this paper, a method is proposed for implementing a linear-phase prototype filter building a nearly perfect reconstruction cosine-modulated FB in such a way that it enables one to partially utilize the coefficient symmetry, thereby reducing the number of required multiplications in the implementation. The proposed method can be applied for implementing FBs with an arbitrary filter order and number of channels. Moreover, it is shown that, in all cases under consideration, the cosine-modulation part of the FB can be implemented by using a fast discrete cosine transform. The efficiency of the proposed implementation is evaluated by means of examples.
... There are many examples in the literature of lapped orthogonal transforms that meet the above requirements, both in the 1D case (e.g., Princen et al. [16], Malvar [17], and Schuller et al. [18]) and the 2D case (e.g., Kovacevic et al. [19] and Johnson et al. [20]). In this work, we focus on the so-called Modified Discrete Cosine Transform (MDCT) [16], used in current state-of-art audio coders. ...
Article
Full-text available
Consider a nonparametric representation of acoustic wave fields that consists of observing the sound pressure along a straight line or a smooth contour L defined in space. The observed data contains implicit information of the surrounding acoustic scene, both in terms of spatial arrangement of the sources and their respective temporal evolution. We show that such data can be effectively analyzed and processed in what we call the space-time-frequency representation space, consisting of a Gabor representation across the spatio-temporal manifold defined by the spatial axis L and the temporal axis t. In the presence of a source, the spectral patterns generated at L have a characteristic triangular shape that changes according to certain parameters, such as the source distance and direction, the number of sources, the concavity of L, and the analysis window size. Yet, in general, the wave fronts can be expressed as a function of elementary directional components, most notably plane waves and far-field components. Furthermore, we address the problem of processing the wave field in discrete space and time, i.e., sampled along L and t, where a Gabor representation implies that the wave fronts are processed in a block-wise fashion. The key challenge is how to choose and customize a spatio-temporal filter bank such that it exploits the physical properties of the wave field while satisfying strict requirements such as perfect reconstruction, critical sampling, and computational efficiency. We discuss the architecture of such filter banks, and demonstrate their applicability in the context of real applications, such as spatial filtering, deconvolution, and wave field coding.
... For good frequency localization in filter banks with a large number of subbands, the filter length N must be large, and therefore the delay associated with the filter bank will also be large. Heller et al. [4] and Schuller et al. [10] proposed two different approaches for designing perfect reconstruction cosine-modulated filter banks with arbitrary delay. However, perfect reconstruction is overly restrictive in many practical applications. ...
Article
Full-text available
This paper presents a method for designing low-delay nonuniform pseudo QMF banks. The method is motivated by the work of Li, Nguyen and Tantaratana, in which the nonuniform filter bank is realized by combining an appropriate number of adjacent subbands of a uniform pseudo QMF filter bank. In prior work, the prototype filter of the uniform pseudo QMF is constrained to have linear phase and the overall delay associated with the filter bank was often unacceptably large for filter banks with a large number of subbands. By relaxing the linear phase constraints, this paper proposes a pseudo QMF filter bank design technique that significantly reduces the delay. An example that experimentally verifies the capabilities of the design technique is presented.
... A paraunitary AS FB has a system delay of L samples, e.g., [20]. A lower delay of down to M −1 samples can be achieved by a biorthogonal cosine modulated AS FB [22]. An alternative is to employ the lifting scheme for the design of uniform and allpass transformed low delay filter-banks [23,24]. ...
Article
A versatile filter-bank concept for adaptive subband filtering is proposed, which achieves a significantly lower algorithmic signal delay than commonly used analysis–synthesis filter-banks. It is derived as an efficient implementation of the filter-bank summation method and performs time-domain filtering with coefficients adapted in the uniform or non-uniform frequency-domain. The frequency warped version of the proposed filter-bank has a lower computational complexity than the usual warped analysis–synthesis filter-bank for most parameter configurations. The application to speech enhancement shows that the same quality of the enhanced speech can be achieved but with lower signal delay. For systems with tight signal delay requirements, modifications of the new filter-bank design are discussed to further decrease its signal delay by approximating the original time-domain filter by an FIR or IIR filter of lower degree. This approach can achieve a very low signal delay and reduced computational complexity with almost no loss for the perceived speech quality.
... Further developments of the method include first of all a pitch synchronous version. The starting point is the time varying case of modulated filter banks realized in [9]. This would give to the method a higher level of generality. ...
Article
Full-text available
In previous papers (1), (2) we introduced a model for pseudo-periodic sounds based on Wornell's results (3) concerning the synthesis of 1/f noise by means of the Wavelet transform (WT). This method provided a good model for representing not only the harmonic part of real-life sounds but also the stochastic components. The latter are of fundamental importance from a perceptual point of view since they contain all the information related to the natural dynamic of musical timbres. In this paper we introduce a refinement of the method, making the spectral-model technique more flexible and the resynthesis coefficient model more accurate. In this way we obtain a powerful tool for sound processing and cross-synthesis.
... Other methods have been proposed for the design of perfect-reconstruction or paraunitary filter banks, which is a class of perfect-reconstruction filter banks. A method using nonconstrained optimization is described in [11], for the design of perfect reconstruction polyphase filter banks with arbitrary delay, aimed at applications in audio coding. Kliewer proposed a method for linear-phase prototype FIR filters with power complementary constraints for cosine modulated filter banks, based on an improved frequency-sampling design [12]. ...
Article
Adaptive filtering is an important subject in the field of signal processing and has numerous applications in fields such as speech processing and communications. Examples in speech processing include speech enhancement, echo- and interference- cancellation, and speech coding. Subband filter banks have been introduced in the area of adaptive filtering in order to improve the performance of time domain adaptive filters. The main improvements are faster convergence speed and the reduction of computational complexity due to shorter adaptive filters in the filter bank subbands. Subband filter banks, however, often introduce signal degradations. Some of these degradations are inherent in the structure and some are inflicted by filter bank parameters, such as analysis and synthesis filter coefficients. Filter banks need to be designed so that the application performance degradation is minimized. The presented design methods in this thesis aim to address two major filter bank properties, transmission delay in the subband decomposition and reconstruction as well as the total processing delay of the whole system, and distortion caused by decimation and interpolation operations. These distortions appear in the subband signals and in the reconstructed output signal. The thesis deals with different methods for filter bank design, evaluated on speech signal processing applications with filtering in subbands. Design methods are developed for uniform modulated filter banks used in adaptive filtering applications. The proposed methods are compared with conventional methods. The performances of different filter bank designs in different speech processing applications are compared. These applications are acoustic echo cancellation, speech enhancement including spectral estimation, subband beamforming, and subband system identification. Real speech signals are used in the simulations and results show that filter bank design is of major importance.
... The design method used in this paper was first described in [6,7], and later in [1,8] combining the desired properties. The resulting filterbanks have the same cosine modulation function as the traditional MDCT, but can have longer window functions which can be non-symmetric, with a generalized or low reconstruction delay. ...
Conference Paper
Full-text available
Low delay perceptual audio coding has recently gained wide acceptance for high quality communication. While common schemes are based on the well-known Modified Discrete Cosine Transform (MDCT) filterbank, this paper describes novel coding algorithms that, for the first time, make use of dedicated low delay filterbanks, thus achieving improved coding efficiency while maintaining or even reducing the low codec delay. The MPEG-4 Enhanced Low Delay AAC (AAC-ELD) coder currently under development within ISO/MPEG combines a traditional perceptual audio coding scheme with spectral band replication (SBR), both running in a delay-optimized fashion by using low delay filterbanks.
... Recent progress in the analysis and design of BCM filter banks have been reported by several authors, see, for example, [18]- [21], [23]- [26], and [28]. Available design techniques for BCM filter banks now include the quadratic-constrained least-squares (QCLS) method [23], [20], [6] that minimizes the stopband energy of the PF subject to the PR constraints expressed as quadratic equalities in terms of the coefficients of the PF; the factorization-based method [19], [24], [25] that yields a parameterized realization in which the PR property is always ensured while minimizing the stopband energy of the PF in an unconstrained optimization setting; and the sequential design method [28] that is carried out by first designing a filter bank with a small number of channels and a relatively short filter length and then gradually increasing the number of channels as well as the filter length using a technique initiated in [5]. In addition, quadratic-constrained-optimization based algorithms [10] and fast design through window function optimization [11] [22] for orthogonal cosine-modulated (OCM) filter banks have been proposed. ...
Article
Full-text available
Designing optimal perfect-reconstruction (PR) and near PR (NPR) cosine-modulated filter banks is essentially a constrained nonlinear minimization problem. This paper proposes two second-order cone-programming based algorithms for designing NPR and practically PR cosine-modulated filter banks with improved performance relative to several established design methods.
... This is too long for communications applications. A remedy for the filter bank delay is to use switchable low-delay filter banks [5], [6] instead of the traditionally used MDCT filter bank. This low-delay filter bank also has the two modes of 128 and 1024 bands. ...
Article
Full-text available
This paper proposes a versatile perceptual audio coding method that achieves high compression ratios and is capable of low encoding/decoding delay. It accommodates a variety of source signals (including both music and speech) with different sampling rates. It is based on separating irrelevance and redundancy reductions into independent functional units. This contrasts traditional audio coding where both are integrated within the same subband decomposition. The separation allows for the independent optimization of the irrelevance and redundancy reduction units. For both reductions, we rely on adaptive filtering and predictive coding as much as possible to minimize the delay. A psycho-acoustically controlled adaptive linear filter is used for the irrelevance reduction, and the redundancy reduction is carried out by a predictive lossless coding scheme, which is termed weighted cascaded least mean squared (WCLMS) method. Experiments are carried out on a database of moderate size which contains mono-signals of different sampling rates and varying nature (music, speech, or mixed). They show that the proposed WCLMS lossless coder outperforms other competing lossless coders in terms of compression ratios and delay, as applied to the pre-filtered signal. Moreover, a subjective listening test of the combined pre-filter/lossless coder and a state-of-the-art perceptual audio coder (PAC) shows that the new method achieves a comparable compression ratio and audio quality with a lower delay.
... Biorthogonal cosine-modulated filter banks with perfect reconstruction (PR) have been studied in [1, 2, 3]. Unlike paraunitary filter banks, they allow the system delay to be designed independently of the filter length, thus resulting in a better stopband attenuation and a smaller transition bandwidth for a given system delay than paraunitary filter banks. ...
Article
Full-text available
This paper describes the implementation of biorthogonal cosine-modulated filter banks on fixed-point arithmetic digital signal processors. The proposed implementation has the property that the overall filter bank keeps the perfect reconstruction property despite coefficient quantization, overflow, and rounding of intermediate results. The realization of the prototype filter is based on a factorization into zero-delay and maximum-delay matrices. We demonstrate how the frequency selectivity of the filter bank and the coding gain change with the available wordlength of the fixed-point implementation and the dynamic range of the input signal. For speech signals it turns out that overflow and rounding errors hardly affect the frequency selectivity of the filters if the input signal uses only 75% of the available dynamic range.
... Another emerging trend is one of convergence between low-rate audio coding algorithms and speech coders, which are increasingly embedding mechanisms to exploit perceptual irrelevancies [389], [390], [399], [400]. Research is also ongoing into potential improvements for the various perceptual coder building blocks, such as novel filter banks for low-delay coding and reduced pre-echo [391], [404] and new psychoacoustic signal analysis techniques [392], [393]. Researchers are also investigating new algorithms for tasks of peripheral interest to perceptual audio coding such as transform-domain signal modifications [394] and digital watermarking [395], [396]. ...
Article
During the last decade, CD-quality digital audio has essentially replaced analog audio. Emerging digital audio applications for network, wireless, and multimedia computing systems face a series of constraints such as reduced channel bandwidth, limited storage capacity, and low cost. These new applications have created a demand for high-quality digital audio delivery at low bit rates. In response to this need, considerable research has been devoted to the development of algorithms for perceptually transparent coding of high-fidelity (CD-quality) digital audio. As a result, many algorithms have been proposed, and several have now become international and/or commercial product standards. This paper reviews algorithms for perceptually transparent coding of CD-quality digital audio, including both research and standardization activities. The paper is organized as follows. First, psychoacoustic principles are described with the MPEG psychoacoustic signal analysis model 1 discussed in some detail. Next, filter bank design issues and algorithms are addressed, with a particular emphasis placed on the Modified Discrete Cosine Transform (MDCT), a perfect reconstruction (PR) cosine-modulated filter bank that has become of central importance in perceptual audio coding. Then, we review methodologies that achieve perceptually transparent coding of FM- and CD-quality audio signals, including algorithms that manipulate transform components, subband signal decompositions, sinusoidal signal components, and linear prediction (LP) parameters, as well as hybrid algorithms that make use of more than one signal model. These discussions concentrate on architectures and applications of
... The design method used here was first described in [23,24], and later in [25,26] presenting a combination of the desired properties. The resulting filter banks utilize the same cosine modulation function as the traditional MDCT. ...
Conference Paper
The MPEG Audio group has recently concluded the standardization process for the MPEG-4 Enhanced Low Delay AAC (AAC-ELD) codec. This codec is a new member of the MPEG Advanced Audio Coding family. It represents the efficient combination of the AAC Low Delay codec and the Spectral Band Replication (SBR) technique known from HE-AAC. This paper provides a complete overview of the underlying technology, presents points of operation as well as applications and discusses MPEG verification test results.
... These parameters are then input into an optimization procedure which, after convergence, provides the filter coefficients for the analysis and synthesis filters. There are other types of extensions that might also prove beneficial, such as the modulated filter bank techniques found by G. Schuller and T. Karp [20]. Our hope is that the interfaces that we have provided and the general organization of the toolkit is such that other users of the system find it easy to implement their own types of filters to fit their needs. ...
Article
Full-text available
This paper describes the design and implementation of Tsunami, a wavelet-based library built to encompass the range of research from offline analysis of computer generated resource signals to the construction and deployment of online systems. Wavelet analysis has proven to be an invaluable analysis technique for discovering characteristics of signals and has been applied to many areas related to computer systems research. Tsunami is created mostly for use in distributed systems, domains where sensors are deployed to sample resource signals related to hosts and networks, and are used for making run-time decisions in applications. From the analysis of computer generated resource signals, online systems may be deployed using our toolkit to provide performance gains in user applications. With Tsunami, a user can seamlessly transition from simulation to deployment of a wavelet-based online system. The toolkit design is extremely general in that the provided interfaces can be used for almost any type of application that may benefit from wavelet approaches. It is also extensible and flexible, allowing users to customize their analysis using the coarse- and fine-grain building blocks provided in the toolkit. In this paper, we summarize the techniques of wavelet analysis for use in computer systems and provide implementation details of the library to provide a user with the power to wavelet-enable their application. We describe how the toolkit can be extended, how it performs in terms of sample rates and scalability, and how it can be used with the RPS toolkit to build distributed wavelet systems. Conclusions and future directions of our research related to this toolkit will be discussed. Tsunami is available from http://www.cs.northwestern.edu/~RPS.
Article
Full-text available
This article reviews Audio Signal Processing and Coding by Andreas Spanias, Ted Painter, Venkatraman Atti, Hoboken, New Jersey, 2007. 464 pp. Price $95.00 (hardcover), ISBN: 978-0-471-79147-8
Article
In recent years there has been a phenomenal increase in the number of products and applications which make use of audio coding formats. Among the most successful audio coding schemes, the MPEG-1 Layer III (mp3), the MPEG-2 Advanced Audio Coding (AAC) or its evolution MPEG-4 High Efficiency-Advanced Audio Coding (HE-AAC) can be cited. More recently, perceptual audio coding has been adapted to achieve coding at low delay so as to become suitable for conversational applications. Traditionally, the use of a filter bank such as the Modified Discrete Cosine Transform (MDCT) is a central component of perceptual audio coding, and its adaptation to low delay audio coding has become an important research topic. Low delay transforms have been developed in order to retain the performance of standard audio coding while dramatically reducing the associated algorithmic delay. This work presents some elements that better accommodate the delay reduction constraint. Among the contributions is a low delay block switching tool which allows the direct transition between long transform and short transform without the insertion of a transition window. The same principle has been extended to define new perfect reconstruction conditions for the MDCT with relaxed constraints compared to the original definition. As a consequence, a seamless reconstruction method has been derived to increase the flexibility of transform coding schemes with the possibility to select a transform for a frame independently from its neighbouring frames. Finally, based on this new approach, a new low delay window design procedure has been derived to obtain an analytic definition for a new family of transforms, permitting high quality with a substantial coding delay reduction. The performance of the proposed transforms has been thoroughly evaluated; an evaluation framework involving an objective measurement of the optimal transform sequence is proposed.
It confirms the relevance of the proposed transforms used for audio coding. In addition, the new approaches have been successfully applied to the recent standardisation work items, such as the low delay audio coding developed at MPEG (LD-AAC and ELD-AAC) and they have been evaluated with numerous subjective testing, showing a significant improvement of the quality for transient signals. The new low delay window design has been adopted in G.718, a scalable speech and audio codec standardized in ITU-T and has demonstrated its benefit in terms of delay reduction while maintaining the audio quality of a traditional MDCT.
Article
This paper describes a systematic technique for designing prototype filters for generating perfect-reconstruction (PR) orthogonal cosine-modulated and modified discrete Fourier transform filterbanks. In the proposed design scheme, the stopband energy of the prototype filter is minimized, and the basic unknowns are the angles of a special lattice structure used for implementing the prototype filter so that the PR property is automatically satisfied independent of the angle values. This selection of the unknowns makes the overall optimization problem unconstrained. Due to the fact that there are several local optima, the design is performed in multiple steps in order to arrive at least at a very good suboptimal solution. First, for the given number of channels, the length of the channel filters, and the stopband edge of the prototype filter, the corresponding two-channel filterbank is designed based on the preoptimized data. Then, after knowing the angles for the optimized two-channel case, the prototype filter for the desired filterbank is generated by gradually increasing the number of channels and by properly using the result of the previous step as a start-up solution for the present step. The main benefit of the proposed design technique is that it enables one to effectively design prototype filters for filterbanks with very high-order analysis and synthesis filters.
Article
This paper studies nearly perfect reconstruction periodic sequences modulated filter banks, which includes many useful examples, e.g., the discrete Fourier transform (DFT), generalized DFT (GDFT), and cosine modulated ones. A general framework is developed to meet arbitrary but feasible design constraints, e.g., length of filters, system delay, decimation ratio, phase linearity, etc., and it also reveals new design freedoms, i.e., phase shifts in the modulation sequences. An efficient Newton algorithm is proposed, and a Matlab/Octave package with diverse design examples is provided. Numerical results show that better designs can be obtained by searching for the solutions in a larger space than the one from traditional frameworks.
Article
Unit-norm tight frames (UNTFs) are interesting in the context of robust transmission because of their robustness to erasures. In this correspondence, we show how to convert any finite-dimensional UNTF into an oversampled filter bank which implements a UNTF. The proposed construction uses the vectors of the finite-dimensional frame as modulating functions of a modulated filter bank. Like the usual critically sampled modulated filter banks, the oversampled filter banks obtained by the proposed construction can be made time variant without losing the UNTF condition.
Chapter
Spectral Band Replication (SBR) is an enhancement compression technology. SBR is a bandwidth extension method which significantly improves the compression efficiency of perceptual audio and speech coding schemes. There are two versions of the SBR technology: Standard SBR and Low Delay SBR (LD-SBR). Central to the operation of standard SBR and LD-SBR are dedicated complex exponential-modulated and real-valued cosine-modulated quadrature mirror filter (QMF) banks as the basic mathematical tools to analyze and synthesize audio signals. This chapter presents the complete unified efficient implementations of complex exponential-modulated and real-valued cosine-modulated QMF banks used both in the standard SBR and LD-SBR encoder and decoder. In general, the following is presented for each QMF bank: the definition of its equivalent block transform with a common parameter M representing the number of sub-bands, its general symmetry property in the frequency or time domain, and the derivation of a fast algorithm for its efficient implementation. All the fast algorithms are analyzed in detail in terms of the arithmetic complexity, regularity, and structural simplicity for a potential real-time low-cost implementation in hardware or software.
Chapter
The MPEG committee has recently completed development of a new audio coding standard, the MPEG-4 Advanced Audio Coding-Enhanced Low Delay (AAC-ELD). State-of-the-art MPEG audio coding standards, such as MPEG-4 AAC Low Complexity (AAC-LC), High Efficiency AAC (HE-AAC), and AAC Low Delay (AAC-LD), utilize the time-to-frequency transformation of an audio block and vice versa, the well-known time domain aliasing cancellation modified discrete cosine transform (TDAC-MDCT). In order to achieve low algorithmic delay, the AAC-ELD has adopted a perfect reconstruction low delay cosine-modulated filter bank, called the low delay MDCT (LD-MDCT). Although the use of LD-MDCT substantially reduces the algorithmic delays, the transform operations in the AAC-ELD codec are still computationally intensive and the LD-MDCT filter banks need to have fast algorithms. Therefore, this chapter concentrates on the analysis/synthesis LD-MDCT filter banks used in the AAC-ELD codec, and mainly on their efficient implementations. This chapter presents: definitions of the analysis/synthesis LD-MDCT (and TDAC-MDCT) filter banks, general symmetry properties of LD-MDCT transforms both in the time and frequency domains, relations between the LD-MDCT and TDAC-MDCT transforms in the analytical forms as well as in the equivalent matrix representations, and efficient implementations of the even-length analysis/synthesis LD-MDCT filter banks. For each fast LD-MDCT algorithm the complete formulae are derived. All the fast even-length LD-MDCT algorithms are investigated and compared in terms of arithmetic complexity and structural simplicity.
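As a rough illustration of the time domain aliasing cancellation that the conventional TDAC-MDCT relies on, the following sketch runs a windowed MDCT analysis and overlap-add synthesis in NumPy. It is not taken from the chapter: the direct matrix form, the sine window, the `2/N` synthesis scaling, and the names `mdct_matrix` and `mdct_roundtrip` are assumptions chosen for brevity, and it covers only the standard MDCT, not the LD-MDCT or any fast algorithm.

```python
import numpy as np

def mdct_matrix(N):
    """N x 2N MDCT kernel (direct, non-fast form; illustration only)."""
    n = np.arange(2 * N)
    k = np.arange(N)[:, None]
    return np.cos(np.pi / N * (n + 0.5 + N / 2) * (k + 0.5))

def mdct_roundtrip(x, N):
    """Analyse x with 50%-overlapping frames, then overlap-add the inverse.

    Assumes a sine window, which satisfies w[n]^2 + w[n+N]^2 = 1.
    """
    w = np.sin(np.pi * (np.arange(2 * N) + 0.5) / (2 * N))
    C = mdct_matrix(N)
    y = np.zeros(len(x))
    for t in range(0, len(x) - 2 * N + 1, N):
        X = C @ (w * x[t:t + 2 * N])                 # N coefficients per hop of N samples
        y[t:t + 2 * N] += w * (2.0 / N) * (C.T @ X)  # aliased frame, windowed again
    return y

N = 8
x = np.random.default_rng(1).standard_normal(10 * N)
y = mdct_roundtrip(x, N)
# The first and last N samples are covered by one frame only, so compare the interior.
interior_error = np.max(np.abs(y[N:-N] - x[N:-N]))
```

Each frame on its own reconstructs the input plus a time-reversed alias; the alias terms of neighbouring frames have opposite signs and cancel in the overlap-add, which is why only the interior samples are compared.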
Conference Paper
In this paper, we consider optimizing the prototype filter of biorthogonal cosine-modulated filter banks in order to reduce overflow occurrence in a fixed-point implementation. We assume that the wordlength of the fixed-point implementation is constant throughout the implementation. We implement the polyphase filters in a form which is inherent to perfect reconstruction. However, the frequency response is subject to fixed-point error, most dominantly overflow. We demonstrate that the floating-point prototype filter with the lowest stopband energy does not result in the fixed-point implementation with the best performance. Based on this result, we show how the fixed-point performance can be improved by preventing amplification by large filter coefficients and by modifying the cost function of the optimization in such a way that all filters have a similar gain. Since the filter bank allows an integer-to-integer mapping with no increase in wordlength, it is well suited for lossless compression algorithms as well as a fast and inexpensive implementation on a hardware platform with fixed-point number format.
Conference Paper
Wireless systems are often subject to bandwidth or cost constraints which are incompatible with high data rates. The key enabling technology for digital audio wireless products is data compression. For real time wireless transmission, very low encoding and decoding delay has become an essential prerequisite. In live productions, the tolerable total delay time is less than a few milliseconds. Current audio coding schemes like MPEG standards or wavelet techniques can hardly reach such a threshold by using overlapping frames of input signal with psychoacoustic model. This paper presents a two dimensional (2D) spatial-frequency processing based audio coder with ultra low delay for real time wireless applications using non-overlapping short block processing and embedded coding. 2D fast lifting wavelet transform with boundary effects minimized is developed for further exploring the correlation of the audio signal. A modified 2D SPIHT (set partitioning in hierarchical trees) algorithm with more bits used to encode the wavelet coefficients and transmitting fewer bits in the sorting pass, is implemented to reduce the correlation between the coefficients at different decomposition levels and inside each band at scalable bit rates. The experiment shows the proposed coder is efficient and has low complexity with less memory requirements in implementation
Chapter
One of the topics in multi-rate digital signal processing is the theory and design of M-band (or M-channel) analysis and synthesis quadrature mirror filter (QMF) banks for sub-band signal decomposition and coding. They are also called M-band maximally decimated critically sampled QMF banks. The analysis QMF bank consists of M uniform and equally spaced channel filters to decompose the input signal into M sub-band signals. The synthesis QMF bank consists of channel filters to reconstruct the original signal exactly from sub-band signals, or to recover a signal which is nearly perfect approximation of the original signal. Historically, discovering the 2-band QMF banks in 1976 stimulated and started research activities leading to extending the theory of near-perfect and perfect reconstruction QMF banks for arbitrary number of sub-bands, to developing a family of near-perfect modulated filter banks (or pseudo-QMF banks) and perfect reconstruction modulated filter banks based on the concept of time domain aliasing cancellation.
Article
Time/Space varying filter banks (FB) are useful for non-stationary images. Lifting factorization of FBs results in structural perfect reconstruction even during the transition from one FB to another. This allows spatial switching between arbitrary FBs, avoiding the need to design border FBs. However, we show that lifting based switching between arbitrarily designed FBs induces spurious transients in the subbands during the transition. In this paper, we study the transients in lifting based switching of two-channel FBs. We propose two solutions to overcome the transients. One solution consists of a boundary handling mechanism to switch between any arbitrarily designed FBs, while the other solution proposes to design the FBs with a set of conditions applied on lifting steps. Both solutions maintain good frequency response during the transition and eliminate the transients. Using the proposed methods, we develop a spatial adaptive transform by switching between the long length FBs (either the JPEG2000 9/7 FB or the newly designed 13/11 FB) and the short length FBs (JPEG2000 5/3 FB) for lossy image compression. This adaptive transform shows PSNR improvement for images over the JPEG2000 9/7 FB in the low bit rate region (up to 0.2 bpp) and subjective improvements with reduced ringing up to medium bit rates (up to 0.6 bpp).
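The structural perfect reconstruction of lifting mentioned above can be seen in a small sketch: each lifting step adds a quantity computed from the other channel, and the inverse subtracts the same quantity, so the round trip is exact whatever the predict and update operators compute. The example below uses integer 5/3-style lifting steps with rounding; the periodic signal extension, the even signal length, and the function names `lift_forward` and `lift_inverse` are assumptions made for this illustration, not the boundary handling or switching scheme of the paper.

```python
import numpy as np

def lift_forward(x):
    """One level of integer 5/3-style lifting (periodic extension assumed)."""
    even, odd = x[0::2].copy(), x[1::2].copy()
    d = odd - (even + np.roll(even, -1)) // 2   # predict step, with rounding
    s = even + (d + np.roll(d, 1) + 2) // 4     # update step, with rounding
    return s, d

def lift_inverse(s, d):
    """Undo the lifting steps in reverse order with opposite signs."""
    even = s - (d + np.roll(d, 1) + 2) // 4
    odd = d + (even + np.roll(even, -1)) // 2
    x = np.empty(even.size + odd.size, dtype=s.dtype)
    x[0::2], x[1::2] = even, odd
    return x

rng = np.random.default_rng(0)
x = rng.integers(-1000, 1000, size=64)   # even length assumed
s, d = lift_forward(x)
x_rec = lift_inverse(s, d)
```

The rounding makes the transform nonlinear, yet `x_rec` matches `x` exactly, because the inverse recomputes the identical rounded quantities. The same argument applies if the predict or update operator changes from sample to sample, although, as the abstract notes, reconstruction alone does not prevent transients in the subbands.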
Conference Paper
Full-text available
In this paper, we consider the implementation of the prototype filter of biorthogonal cosine-modulated filter banks on processors with fixed-point arithmetic. The realization is based on zero-delay and maximum-delay matrices. We show that the perfect reconstruction property of the filter bank is not affected by quantization, rounding and overflow. We also demonstrate how the latter operations influence the frequency selectivity of the filter bank
Article
This paper describes a design method of cosine-modulated filter banks (CMFB's) for an efficient coding of images. Whereas the CMFB has advantages of low design and implementation cost, subband filters of the CMFB do not have linear phase property. This prevents from employing a symmetric extension in transformation process, and leads to a degradation of the image compression performance. However, a recently proposed smooth extension alleviates the problem with CMFB's. As a result, well-designed CMFB's can be expected to be good candidates for a transform block in image compression applications. In this paper, we present a novel design approach of regular CMFB's. After introducing a regularity constraint on lattice parameters of a prototype filter in paraunitary (PU) CMFB's, we also derive a regularity condition for perfect reconstruction (PR) CMFB's. Finally, we design regular 8-channel PUCMFB and PRCMFB by an unconstrained optimization of residual lattice parameters, and several simulation results for test images are compared with various transforms for evaluating the proposed image coder based on the CMFB's with one degree of regularity. In addition, we show a computational complexity of the designed CMFB's.
Article
The MPEG committee has recently completed development of a new audio coding standard “MPEG-4 Advanced Audio Coding—Enhanced Low Delay” (AAC-ELD). AAC-ELD is targeted towards high-quality, full-duplex communication applications such as audio and video conferencing. AAC-ELD uses low delay spectral band replication (LD-SBR) technology together with a low delay AAC core encoder to achieve high coding efficiency and low algorithmic delays. In this paper, we present fast algorithms for computing LD-SBR filterbanks in AAC-ELD. The proposed algorithms map complex exponential modulation portion of the filterbanks to discrete cosine transforms of types IV and II. Our proposed mapping also allows to merge some multiplications with the windowing stage that precedes or succeeds the modulation step. This further reduces computational complexity. Our presentation includes detailed explanation and flow-graphs of the algorithms, complexity analysis, and comparisons with alternative implementations.
Article
Spectral Band Replication (SBR) is an enhancement compression technology, a bandwidth extension method which significantly improves the compression efficiency of perceptual audio and speech coding schemes. There are two versions of the SBR technology: Standard SBR and low delay SBR (LD-SBR). Central to the operation of standard and LD-SBR are dedicated complex exponential-modulated and real-valued cosine-modulated quadrature mirror filter (QMF) banks as the basic mathematical tools to analyze and synthesize audio signals. This tutorial paper presents the complete unified efficient implementations of complex exponential-modulated and real-valued cosine-modulated QMF banks used both in the standard SBR and LD-SBR encoder and decoder. In general, for each QMF bank is presented: Definition of its equivalent block transform with a common parameter M representing the number of sub-bands, its general symmetry property in the frequency or time domain, and the derivation of a fast algorithm for its efficient implementation. All fast algorithms are analyzed in detail in terms of the arithmetic complexity, regularity and structural simplicity for a potential real-time low-cost implementation in hardware or software.
Article
We describe an efficient system, which directly extracts features from compressed audio material. It consists of a time-frequency conversion method and a feature extraction algorithm. The conversion method provides the feature extraction algorithm with a suitable complex spectral representation directly from the compressed domain, and further allows to adjust its computational complexity. Several operating points using different computational complexities were tested with an MPEG audio identification system to evaluate the identification performance. Based on our implementations, we found that our direct conversion is about a factor of 1.6 to 3.3 faster than the conventional full decoding.
Conference Paper
Unimodular filterbanks (UMFBs) have the lowest possible system delay of M - 1 samples, where M is the number of channels. In this paper we make use of this property to show that finite length signals can be processed using UMFBs without any border distortions. More specifically, we show that perfect reconstruction (PR) is achieved without increasing the number of samples at the analysis output, and without any specially designed boundary filters. We provide an extremely simple proof for this remarkable property. We also exploit this property to construct time varying UMFBs without any boundary filters. The UMFB is designed based on a simple canonical structure of an associated nilpotent matrix, with minimum number of free parameters. The optimal set of parameters is obtained through unconstrained optimization. The low computation overhead makes this design suitable for simple and efficient hardware implementation.
Article
Fractal additive synthesis (FAS) is a method for the synthesis-by-analysis of voiced-sounds. FAS is based on a harmonic version of the wavelet transform (WT), and it allows to define two different parametric models for the deterministic and the stochastic components of sounds, respectively. In this letter, we introduce a pitch-synchronous (PS) extension of FAS. By adapting an already existing method for the design of time-varying cosine modulated filter banks (TV-CMFB), we are now able to deal with a much wider class of voiced-sounds, i.e., any voiced-sound showing a variable pitch as in the case, for example, of glissando or vibrato effects.
Article
Classical design of filterbanks results in equal order inverses, thus resulting in filterbanks with equal analysis and synthesis filter length. While it is known that filterbanks may have unequal orders, there does not exist a simple, systematic method to design a $p$th-order analysis and $q$th-order synthesis filterbank where $p \ne q$. Also, the design criteria of the $p$th-order analysis having $q$th-order synthesis filters ($p \ne q$) with a flexibility to control the system delay has never been addressed concomitantly. In this work, we propose the use of unit non-diagonal matrix polynomials to design such filterbanks for given orders $p$, $q$ with arbitrary delay. We demonstrate how these orders depend on the sequence of unit matrices. Not only the achievable ranges of $p$, $q$ are found, but construction methods for all such cases are also suggested. The proposed design has several desirable properties such as structural perfect reconstruction in its factorized form, completeness, and ability to control the resulting system delay of the filterbank. Several design examples and an application example for audio illustrate the flexibility and usefulness of the proposed design.
Conference Paper
Literature shows that the design criteria of pth order analysis having qth order synthesis filters (p ≠ q) with a flexibility to control the system delay has never been addressed concomitantly. In this paper, we propose a systematic design for a filterbank that can have arbitrary delay with a (p, q) order. Such filterbanks play an important role especially in applications where low delay-high quality signals are required, like a digital hearing aid.
Article
We present a new method for the design of the cosine-modulated filter bank with low complexity and low delay. The design of such a filter bank is formulated as a nonconvex optimization problem with equality constraints. An iterative scheme is developed to solve this optimization problem. To illustrate the effectiveness of the proposed method, a numerical example is presented. Comparing our method with that in [14], our approach can reduce the number of filter taps by up to 40 % while maintaining a relatively low delay.
Chapter
In this chapter, the principles of audio coding will be described, with emphasis on low delay audio coding. Audio coding is based on psycho-acoustic masking effects, as computed by psycho-acoustic models. To use the masking effects and to obtain a good compression ratio, filter banks are used. The principles of psycho-acoustics and of the design of filter banks are presented. Further a new low delay audio coding scheme based on prediction is shown.
Chapter
This chapter introduces time/frequency decompositions in general and the design of filter banks with perfect reconstruction and near perfect reconstruction for audio coding. First it describes their theoretical foundations, including down- and up-sampling, and the polyphase representation. Then it applies these principles to the design of MDCT filter banks, extended length filter banks, low delay filter banks, and (P)QMF filter banks, including optimizations of their frequency responses. Finally it covers time-varying and switchable filter banks. Assumed knowledge is the basics of digital signals and systems and of information theory.
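To make the polyphase representation concrete, the sketch below checks numerically that filtering followed by down-sampling by M equals the sum of M short filters, each running at the low rate on a delayed, down-sampled copy of the input. It is a generic textbook identity rather than material from the chapter; the function names and the zero initial state are assumptions for this illustration.

```python
import numpy as np

def decimate_direct(x, h, M):
    """Filter at the full rate, then keep every M-th output sample."""
    return np.convolve(x, h)[::M]

def decimate_polyphase(x, h, M):
    """Same result, computed with M polyphase branches at the low rate."""
    n_out = -(-(len(x) + len(h) - 1) // M)           # ceil of full length / M
    y = np.zeros(n_out)
    for l in range(M):
        h_l = h[l::M]                                # h_l[k] = h[k*M + l]
        if h_l.size == 0:
            continue                                 # branch has no taps
        x_l = np.concatenate([np.zeros(l), x])[::M]  # x_l[m] = x[m*M - l]
        branch = np.convolve(x_l, h_l)
        m = min(len(branch), n_out)
        y[:m] += branch[:m]
    return y

rng = np.random.default_rng(2)
x = rng.standard_normal(50)
h = rng.standard_normal(13)
M = 4
y_direct = decimate_direct(x, h, M)
y_poly = decimate_polyphase(x, h, M)
```

The polyphase form never computes the output samples that down-sampling would discard, which is the source of its efficiency. Stacking the branch filters of all M channels of a filter bank gives the polyphase matrix.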
Article
Full-text available
The family of lapped orthogonal transforms is extended to include basis functions of arbitrary length. Within this new family, the extended lapped transform (ELT) is introduced, as a generalization of the previously reported modulated lapped transform (MLT). Design techniques and fast algorithms for the ELT are presented, as well as examples that demonstrate the good performance of the ELT in signal coding applications. Therefore, the ELT is a promising substitute for traditional block transforms in transform coding systems, and also a good substitute for less efficient filter banks in subband coding systems.
Article
Full-text available
First published in 1995, Wavelets and Subband Coding offered a unified view of the exciting field of wavelets and their discrete-time cousins, filter banks, or subband coding. The book developed the theory in both continuous and discrete time, and presented important applications. During the past decade, it filled a useful need in explaining a new view of signal processing based on flexible time-frequency analysis and its applications. Since 2007, the authors now retain the copyright and allow open access to the book.
Article
Full-text available
A new formulation for the analysis and design of modulated filter banks is introduced. The formulation provides a broad range of design flexibility within a compact framework and allows for the design of a variety of computationally efficient modulated filter banks with different numbers of bands and virtually arbitrary lengths. A unique feature of the formulation is that it provides explicit control of the input-to-output system delay in conjunction with perfect reconstruction. Design examples are given to illustrate the methodology
Article
Full-text available
It is well known that FIR filter banks that satisfy the perfect-reconstruction (PR) property can be obtained by cosine modulation of a linear-phase prototype filter of length N=2mM, where M is the number of channels. In this paper, we present a PR cosine-modulated filter bank where the length of the prototype filter is arbitrary. The design is formulated as a quadratic-constrained least-squares optimization problem, where the optimized parameters are the prototype filter coefficients. Additional regularity conditions are imposed on the filter bank to obtain the cosine-modulated orthonormal bases of compactly supported wavelets. Design examples are given
Article
Full-text available
Considers the construction of orthogonal time-varying filter banks. By examining the time domain description of the two-channel orthogonal filter bank the authors find it possible to construct a set of orthogonal boundary filters, which allows to apply the filter bank to one-sided or finite-length signals, without redundancy or distortion. The method is constructive and complete. There is a whole space of orthogonal boundary solutions, and there is considerable freedom for optimization. This may be used to generate subband tree structures where the tree varies over time, and to change between different filter sets. The authors also show that the iteration of discrete-time time-varying filter banks gives continuous-time bases, just as in the stationary case. This gives rise to wavelet, or wavelet packet, bases for half-line and interval regions
Article
Full-text available
The authors obtain a necessary and sufficient condition on the 2M (M = number of channels) polyphase components of a linear-phase prototype filter of length N = 2mM (where m is an arbitrary positive integer), such that the polyphase component matrix of the modulated filter is lossless. The losslessness of the polyphase component matrix, in turn, is sufficient to ensure that the analysis/synthesis system satisfies perfect reconstruction (PR). Using this result, a novel design procedure is presented based on the two-channel lossless lattice. This enables the design of a large class of FIR (finite impulse response)-PR filter banks, and includes the N = 2M case. It is shown that this approach requires fewer parameters to be optimized than in the pseudo-QMF (quadrature mirror filter) designs and in the lossless lattice based PR-QMF designs (for equal length filters in the three designs). This advantage becomes significant when designing long filters for large M. The design procedure and its other advantages are described in detail. Design examples and comparisons are included
Article
This paper introduces a new formulation for analysis and design of modulated filter banks. A unique feature of the formulation is that it provides explicit control of the input-to-output system delay. The paper discusses minimum delay filter banks and demonstrates that truly exact reconstruction is possible in this context. The formulation provides a broad range of design flexibility within a compact framework and allows for the design of a variety of computationally efficient modulated filter banks with different numbers of bands and virtually arbitrary lengths. 1 Introduction Subband analysis/synthesis filter banks have been a topic of vigorous study for many years now. In the most common mode of operation, an analysis filter bank first splits the input signal into several frequency bands and then decimates each subband to its Nyquist sampling rate. The synthesis filter bank performs the dual operation by upsampling the subbands, and then filtering them to remove the imaged spectral...
Book
The scientists and engineers of today are relentless in their continuing study and analysis of the world about us from the microcosm to the macrocosm. A central purpose of this study is to gain sufficient scientific information and insight to enable the development of both representative and useful models of the superabundance of physical processes that surround us. The engineers need these models and the associated insight in order to build the information processing systems and control systems that comprise these new and emerging technologies. Much of the early modeling work that has been done on these systems has been based on the linear time-invariant system theory and its extensive use of Fourier transform theory for both continuous and discrete systems and signals. However, many of the signals arising in nature and real systems are neither stationary nor linear but tend to be concentrated in both time and frequency. Hence a new methodology is needed to take these factors properly into account.
Article
The method of bit rate reduction for audio signals presented in this paper is based on overlapping transforms with 'Time Domain Aliasing Cancellation'. By adapting the window functions and the transform lengths to the input signal, the performance of the transform coder with overlapping blocks in presence of impulses and rapid attacks in the input signal is improved.
Article
The design of multidimensional filter banks and wavelets has been an area of active research for use in video and image communication systems. At the same time, efficient structures for the implementation of such filters are of importance. In 1-D, the well-known lattice structure and the recently introduced ladder structure are attractive. However, their extensions to higher dimensions (m-D) have been limited. In this paper we reintroduce the ladder structure, with the purpose of transforming the structure into m-D using the McClellan transform.
Article
Perfect reconstruction (PR) FIR filter banks, obtained by modulation of a linear-phase, lowpass, prototype filter and of length 2Mm are well known. Recently, PR modulated filter banks (MFBs) with the analysis and synthesis banks obtained from different prototypes have been reported. This paper describes a general form of modulation that includes modulations used in the literature. This modulation depends on an integer parameter, the modulation phase. The PR property is characterized for MFBs with finite and infinite impulse response filters. The MFB PR problem reduces to roughly M/2 two-channel PR problems. A natural dichotomy in the PR conditions leads us to the concepts of Type 1 and Type 2 MFBs. Unitary MFBs are characterized by the M/2 two-channel PR filter banks also being unitary (for FIR filters of length N = 2Mm, these results are given in (Malvar, Electr Lett. 26, June 1990, 906–907; Koilpillai and Vaidyanathan, IEEE Trans. SP40, No. 4, Apr. 1992, 770–783)). We also give a necessary and sufficient condition for a large class (including FIR) unitary MFB prototypes to have symmetric (even or odd) prototype filters, and exhibit unitary MFBs without symmetric prototypes. A parameterization of all FIR unitary MFBs is also given. An efficient design procedure for FIR unitary MFBs is developed. It turns out that MFBs can be implemented efficiently using Type III and Type IV DCTs. Compactly supported modulated wavelet tight frames are shown to exist and completely parameterized. K-regular modulated WTFs are designed numerically and analytically by solving a set of non-linear equations over the parameters. Design of optimal modulated WTFs for the representation of any given signal is described with examples, and this is used to design smooth modulated WTFs.
Article
The concept of time-varying filter bank structures and their design are considered. In such structures, the analysis filter banks and the corresponding reconstruction filter banks are changed with time. The reconstruction problem is studied and it is shown that perfect reconstruction can be achieved. Both general and periodically time-varying structures are considered. A time-domain formulation of the time-varying structures is presented. Based on this formulation, a design approach is presented. The design approach is illustrated in the context of a complete design example.
Article
The authors present a simple derivation of a parallel filterbank based on cosine-modulated versions of a model low-pass filter. With a nonuniform channel separation an efficient implementation consisting of a DFT (discrete Fourier transform) related transform and subfilters is possible. Using critical sampling of each channel and FIR (finite impulse response) filters, the conditions for perfect reconstruction are given. The computational complexity of the derived FIR filterbank is much lower than for a tree-structured FIR filterbank but cannot compete with the most efficient IIR filterbanks.
Book
Multirate digital signal processing techniques have been practiced by engineers for more than a decade and a half. This discipline finds applications in speech and image compression, the digital audio industry, statistical and adaptive signal processing, numerical solution of differential equations, and in many other fields. It also fits naturally with certain special classes of time-frequency representations such as the short-time Fourier transform and the wavelet transform, which are useful in analyzing the time-varying nature of signal spectra. Over the last decade, there has been a tremendous growth of activity in the area of multirate signal processing, perhaps triggered by the first book in this field [Crochiere and Rabiner, 1983]. Particularly impressive is the amount of new literature in digital filter banks, multidimensional multirate systems, and wavelet representations. The theoretical work in multirate filter banks appears to have reached a level of maturity which justifies a thorough, unified, and in-depth treatment of these topics. This book is intended to serve that purpose, and it presents the above mentioned topics under one cover. Research in the areas of multidimensional systems and wavelet transforms is still proceeding at a rapid rate. We have dedicated one chapter to each of these, in order to bring the reader up to a point where research can be begun. I have always believed that it is important to appreciate the generality of principles and to obtain a solid theoretical foundation, and my presentation here reflects this philosophy. Several applications are discussed throughout the book, but the general principles are presented without bias towards specific application-oriented detail. The writing style here is very much in the form of a text. Whenever possible I have included examples to demonstrate new principles. Many design examples and complete design rules for filter banks have been included. 
Each chapter includes a fairly extensive set of homework problems (totaling over 300). The solutions to these are available to instructors, from the publisher. Tables and summaries are inserted at many places to enable the reader to locate important results conveniently. I have also tried to simplify the reader’s task by assigning separate chapters for more advanced material. For example, Chap. 11 is dedicated to wavelet transforms, and Chap. 14 contains detailed developments of many results on paraunitary systems. Whenever a result from an advanced chapter (for example, Chap. 14) is used in an earlier chapter, this result is first stated clearly within the context of use, and the reader is referred to the appropriate chapter for proof. The text is self-contained for readers who have some prior exposure to digital signal processing. A one-term course which deals with sampling, discrete-time Fourier transforms, z-transforms, and digital filtering, is sufficient. In Chap. 2 and 3 a brief review of this material is provided. A thorough exposition can be found in a number of references, for example, [Oppenheim and Schafer, 1989]. Chapter 3 also contains some new material, for example, eigenfilters, and detailed discussions on allpass filters, which are very useful in multirate system design. A detailed description of the text can be found in Chap. 1. Chapters 2 and 3 provide a brief review of signals, systems, and digital filtering. Chapter 4, which is the first one on multirate systems, covers the fundamentals of multirate building blocks and filter banks, and describes many applications. Chapter 5 introduces multirate filter banks, laying the theoretical foundation for alias cancelation, and elimination of other errors. The first two sections in Chap. 4 and 5 contain material overlapping with [Crochiere and Rabiner, 1983]. Most of the remaining material in these chapters, and in the majority of the chapters that follow, have not appeared in this form in text books. 
Chapters 6 to 8 provide a deeper study of multirate filter banks, and present several design techniques, including those based on the so-called paraunitary matrices. (These matrices play a role in the design of many multirate systems, and are treated in full depth in Chap. 14.) Chapters 9 to 12 cover special topics in multirate signal processing. These include roundoff noise effects (Chap. 9), block filtering, periodically time varying systems and sampling theorems (Chap. 10), wavelet transforms (Chap. 11) and multidimensional multirate systems (Chap. 12). Chapters 13 and 14 give an in-depth coverage of multivariable linear systems and lossless (or paraunitary) systems, which are required for a deeper understanding of multirate filter banks and wavelet transforms. There are five appendices which serve as references as well as supplementary reading. Three of these are review-material (matrix theory, random processes, and Mason’s gain formula). Two of the appendices contain results directly related to filter bank systems. One of these is a technique for spectral factorization; the other one analyzes the effects of quantization of subband signals. Many of these chapters have been taught at Caltech over the last three years. This text can be used for teaching a one, two, or three term (quarter or semester) course on one of many possible topics, for example, multirate fundamentals, multirate filter banks, wavelet representation, and so on. There are many homework problems. The instructor has a great deal of flexibility in choosing the topics, but I prefer not to bias him or her by providing specific course outlines here. In summary, I have endeavored to produce a text which is useful for the classroom, as well as for self-study. It is also hoped that it will bring the reader to a point where he/she can start pursuing research in a vast range of multirate areas.
Finally, I believe that the text can be comfortably used by the practicing engineer because of the inclusion of several design procedures, examples, tables, and summaries.
Chapter
This paper is essentially tutorial in nature. We show how any discrete wavelet transform or two band subband filtering with finite filters can be decomposed into a finite sequence of simple filtering steps, which we call lifting steps but that are also known as ladder structures. This decomposition corresponds to a factorization of the polyphase matrix of the wavelet or subband filters into elementary matrices. That such a factorization is possible is well-known to algebraists (and expressed by the formula ); it is also used in linear systems theory in the electrical engineering community. We present here a self-contained derivation, building the decomposition from basic principles such as the Euclidean algorithm, with a focus on applying it to wavelet filtering. This factorization provides an alternative for the lattice factorization, with the advantage that it can also be used in the biorthogonal, i.e., non-unitary case. Like the lattice factorization, the decomposition presented here asymptotically reduces the computational complexity of the transform by a factor two. It has other applications, such as the possibility of defining a wavelet-like transform that maps integers to integers.
Conference Paper
The authors present a derivation of the necessary and sufficient condition on the polyphase components of a linear-phase prototype (length N = 2mM) such that the polyphase component matrix of the filter bank is lossless. This in turn ensures that the modulated filter bank satisfies the PR (perfect reconstruction) property. An efficient design procedure (which involves fewer parameters to be optimized than other approaches) is presented. By this approach, FIR (finite-impulse-response) PR filter banks can be designed for an arbitrary number of channels. Moreover, since both the analysis and synthesis filter banks are obtained by modulation, they can be implemented very efficiently (using the discrete cosine transform). The design procedure is outlined and a design example is presented.
Conference Paper
A new filter structure and design method for time-varying cosine modulated FIR filter banks with critical sampling, perfect reconstruction, and an efficient implementation is presented. The proposed filter banks have an arbitrary system delay which can be chosen in the design process and is independent of the arbitrary filter length, hence making a low system delay possible. The time variation includes changing the number of bands and/or filters during signal processing while maintaining critical sampling and perfect reconstruction. The transition windows can be overlapping, which improves the frequency responses. It is based on a factorization of the polyphase matrices into a cascade of 2 types of simple matrices
Article
The authors consider expansions which give arbitrary orthonormal tilings of the time-frequency plane. These differ from the short-time Fourier transform, wavelet transform, and wavelet packets tilings in that they change over time. They show how this can be achieved using time-varying orthogonal tree structures, which preserve orthogonality, even across transitions. The method is based on the construction of boundary and transition filters; these allow us to construct essentially arbitrary tilings. Time-varying modulated lapped transforms are a special case, where both boundary and overlapping solutions are possible with filters obtained by modulation. They present a double-tree algorithm which for a given signal decides on the best binary segmentation in both time and frequency. That is, it is a joint optimization of time and frequency splitting. The algorithm is optimal for additive cost functions (e.g., rate-distortion), and results in time-varying best bases, the main application of which is for compression of nonstationary signals. Experiments on test signals are presented
Conference Paper
A new design method for biorthogonal modulated filter banks is presented. It is based on a cascade of simple matrices, and it has some properties that have not been reported before. It represents filter banks with arbitrary overall system delay and filter length, it is shown that almost all cosine modulated filter banks can be described by this structure, and that it leads to a more efficient implementation than previous structures. Imposing certain symmetries on the matrices can be used to design low delay filter banks with identical (except for the sign) baseband impulse responses for the analysis and synthesis filter bank.
Conference Paper
We introduce a polyphase representation for linear time-varying (LTV) filters. Using the proposed polyphase approach, we are able to show some unusual properties of LTV filter bank (FB) that have not been pointed out before. The problem of interchangeability of analysis and synthesis banks is considered. Lossless TVFBs is studied in detail. Perfect reconstruction (PR) can always be obtained for lossless TVFBs. However, in general a PR lossless TVFB will only generate a tight frame
Conference Paper
This paper generalizes and unifies well-known results on modulated filter banks (MFBs) and modulated wavelet tight frames (MWTFs). It classifies MFBs based on the discrete cosine or sine transforms that they are associated with. By proper choice of the form of modulation the perfect reconstruction (PR) conditions are seen to be (surprisingly) identical for all classes of MFBs. This has the interesting consequence that optimal MFB prototype designs can be shared across MFB classes. For some classes of MFBs associated MWTFs do not exist, while for others they do. The results cover both orthogonal and biorthogonal MFBs; and the filters could be arbitrary sequences in l²(Z).
Conference Paper
Cosine-modulated filter banks have been studied extensively because of their design ease and efficient implementation. These filter banks either have restricted lengths or assume the paraunitary property for the polyphase matrices. In this paper, the biorthogonal cosine-modulated filter bank with arbitrary length is considered and the perfect reconstruction (PR) conditions are derived. These conditions are the general form of the PR conditions reported in the literature. Examples of PR systems with variable overall delay are designed using the quadratic constrained least squares formulation
Conference Paper
The authors examine some of the analysis/synthesis issues associated with FIR (finite impulse response) time-varying filter banks where the filter bank coefficients are allowed to change in response to the input signal. Several issues are identified as being important for realizing performance gains from time-varying filter banks in image coding applications. These issues relate to the behavior of the filters as transition from one set of filter banks to another occurs. Lattice structure formulations for the time-varying filter bank problems are introduced and discussed in terms of their properties and transition characteristics.
Conference Paper
It is well known that FIR (finite impulse response) filter banks which satisfy the perfect reconstruction (PR) property can be obtained by cosine modulation of a linear-phase prototype filter of length N = 2mM, where M is the number of channels. Moreover, the overall delay of this system is (2mM - 1) samples. The author presents a PR cosine-modulated filter bank whose overall delay is (2(α+1)M - 1) samples, where α is an integer in the range 0 ≤ α ≤ 2m - 2. Consequently, as is shown, the prototype filter of the proposed filter bank no longer has linear phase.
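The two delay expressions in this abstract are easy to compare numerically. The following is a small sketch of that arithmetic only; the function names are hypothetical, and the example values M = 8 and m = 4 are chosen purely for illustration.

```python
def linear_phase_delay(M, m):
    """Overall delay, in samples, of the linear-phase design with N = 2*m*M."""
    return 2 * m * M - 1

def adjustable_delay(M, m, alpha):
    """Overall delay 2*(alpha + 1)*M - 1 for an integer 0 <= alpha <= 2*m - 2."""
    if not 0 <= alpha <= 2 * m - 2:
        raise ValueError("alpha must satisfy 0 <= alpha <= 2m - 2")
    return 2 * (alpha + 1) * M - 1

M, m = 8, 4                                  # illustrative values only
baseline = linear_phase_delay(M, m)          # delay tied to the filter length
lowest = adjustable_delay(M, m, alpha=0)     # smallest delay in the stated range
matched = adjustable_delay(M, m, alpha=m - 1)  # reproduces the linear-phase delay
```

Setting α = m - 1 recovers the linear-phase delay, while smaller α gives a shorter delay for the same filter length, in steps of 2M samples.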
Conference Paper
The concept of time-varying filter bank structures and their design are considered. In such structures, the analysis filter banks and the corresponding reconstruction filter banks are changed with time. The reconstruction problem is studied and it is shown that perfect reconstruction can be achieved. Both general and periodically time-varying structures are considered. A time-domain formulation of the time-varying structures is presented. Based on this formulation, a design approach is presented. The design approach is illustrated in the context of a complete design example
Article
This paper presents a general framework for maximally decimated modulated filter banks. The theory covers the known classes of cosine modulation and relates them to complex-modulated filter banks. The prototype filters have arbitrary lengths, and the overall delay of the filter bank is arbitrary, within fundamental limits. Necessary and sufficient conditions for perfect reconstruction (PR) are derived using the polyphase representation. It is shown that these PR conditions are identical for all types of modulation: modulation based on the discrete cosine transform (DCT), both DCT-III/DCT-IV and DCT-I/DCT-II, and modulation based on the modified discrete Fourier transform (MDFT). A quadratic-constrained design method for prototype filters yielding PR with arbitrary length and system delay is derived, and design examples are presented to illustrate the tradeoff between overall system delay and stopband attenuation (subchannelization).
Article
A structure for implementing lapped transforms with time-varying block sizes that allows full orthogonality of the transient transforms is presented. The formulation is based on a factorization of the transfer matrix into orthogonal factors. Such an approach can be viewed as a sequence of stages with variable-block-size transforms separated by sample-shuffling (delay) stages. Details and design examples for a first-order system are presented
Article
Perfect reconstruction (PR) time-varying analysis-synthesis filter banks are those in which the filters are allowed to change from one set of PR filter banks to another as the input signal is being processed. Such systems have the property that, in the absence of coding, they faithfully reconstruct every sample of the input. Various methods have been reported for the time-varying filter bank design; all of them, however, utilize structures for conventional PR filter banks. These conventional structures that have been applied in the past result in different limitations in each method. This paper introduces a new structure for exactly reconstructing time-varying analysis-synthesis filter banks. This structure consists of the conventional filter bank followed by a time-varying post filter. The new method requires neither the redesign of the analysis sections nor the use of any intermediate analysis filters during transition periods. It provides a simple and elegant procedure for designing time-varying filter banks without the disadvantages of the previous methods
Article
A complete factorization of all optimal (in terms of quick transition) time-varying FIR unitary filter bank tree topologies is obtained. This has applications in adaptive subband coding, tiling of the time-frequency plane and the construction of orthonormal wavelet and wavelet packet bases for the half-line and interval. For an M-channel filter bank the factorization allows one to construct entry/exit filters that allow the filter bank to be used on finite signals without distortion at the boundaries. One of the advantages of the approach is that an efficient implementation algorithm comes with the factorization. The factorization can be used to generate filter bank tree-structures where the tree topology changes over time. Explicit formulas for the transition filters are obtained for arbitrary tree transitions. The results hold for tree structures where filter banks with any number of channels or filters of any length are used. Time-varying wavelet and wavelet packet bases are also constructed using these filter bank structures. The present construction of wavelets is unique in several ways: 1) the number of entry/exit functions is equal to the number of entry/exit filters of the corresponding filter bank; 2) these functions are defined as linear combinations of the scaling functions, whereas other methods involve infinite product constructions; 3) the functions are trivially as regular as the wavelet bases from which they are constructed.
Article
Presents an effective design algorithm for analysis-synthesis filter banks with computationally efficient structures. Although a wide variety of implementation structures can be accommodated, the focus of the paper is on cosine modulated filter banks. The design procedure is based on a time domain formulation of analysis-synthesis filter banks in which each individual channel filter is constrained to be a cosine modulated version of a baseband filter. The resulting filter banks are very efficient in terms of computational requirements and are relatively easy to design. A unique feature of this approach is that relatively low reconstruction delays can be imposed on the system. A discussion of the associated computational properties of the designed systems and some design examples are included.
Article
The subject of this paper is the design of low and minimum delay, exact reconstruction analysis-synthesis systems based on filter banks. It presents a time domain approach to the problem of designing FIR filter banks with adjustable reconstruction delays. It is shown that using a time domain formulation for the analysis-synthesis systems, the system delay can be considered to be relatively independent of the length of the analysis and synthesis filters. After a summary of the time domain analysis and design framework, the design of low and minimum delay systems is considered in detail. Several design examples are provided in the paper to demonstrate the performance of the design algorithm
A single-sideband analysis/synthesis system is proposed which provides perfect reconstruction of a signal from a set of critically sampled analysis signals. The technique is developed in terms of a weighted overlap-add method of analysis/synthesis and allows overlap between adjacent time windows. This implies that time domain aliasing is introduced in the analysis; however, this aliasing is cancelled in the synthesis process, and the system can provide perfect reconstruction. Achieving perfect reconstruction places constraints on the time domain window shape which are equivalent to those placed on the frequency domain shape of analysis/synthesis channels used in recently proposed critically sampled systems based on frequency domain aliasing cancellation [7], [8]. In fact, a duality exists between the new technique and the frequency domain techniques of [7] and [8]. The proposed technique is more efficient than frequency domain designs for a given number of analysis/synthesis channels, and can provide reasonably band-limited channel responses. The technique could be particularly useful in applications where critically sampled analysis/synthesis is desirable, e.g., coding.
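Time domain aliasing cancellation can be demonstrated with a short sketch. The code below is an assumption-laden illustration, not the formulation of the abstract: it uses the later, now common MDCT form with a sine window, a direct O(N²) transform instead of a fast algorithm, and hypothetical function names. Each block of 2N windowed samples yields only N coefficients (critical sampling), and the aliasing in each inverse-transformed block cancels when adjacent blocks are overlap-added.

```python
import math

def sine_window(N):
    # Satisfies w[n]^2 + w[n + N]^2 = 1, the condition needed for cancellation.
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def _phase(n, k, N):
    return math.pi / N * (n + 0.5 + N / 2) * (k + 0.5)

def mdct(block, window):
    """2N windowed samples -> N coefficients (direct evaluation)."""
    N = len(block) // 2
    return [sum(window[n] * block[n] * math.cos(_phase(n, k, N))
                for n in range(2 * N)) for k in range(N)]

def imdct(coeffs, window):
    """N coefficients -> 2N windowed, time-aliased samples."""
    N = len(coeffs)
    return [window[n] * (2.0 / N) * sum(coeffs[k] * math.cos(_phase(n, k, N))
                                        for k in range(N)) for n in range(2 * N)]

def analyze_and_resynthesize(x, N):
    """Run MDCT analysis and overlap-add synthesis with hop size N."""
    if len(x) % N != 0:
        raise ValueError("signal length must be a multiple of N")
    w = sine_window(N)
    padded = [0.0] * N + list(x) + [0.0] * N   # so every sample sees two blocks
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        coeffs = mdct(padded[start:start + 2 * N], w)
        for n, v in enumerate(imdct(coeffs, w)):
            out[start + n] += v                # aliasing terms cancel here
    return out[N:N + len(x)]

x = [1.0, -2.0, 3.5, 0.25, 4.0, -1.0, 0.0, 2.0]
y = analyze_and_resynthesize(x, 4)
max_error = max(abs(a - b) for a, b in zip(x, y))
```

The reconstruction error `max_error` is at the level of floating-point rounding. Replacing the sine window with one that violates the sum-of-squares condition leaves residual aliasing in the output.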
Article
In this paper we present the basic idea behind the lifting scheme, a new construction of biorthogonal wavelets which does not use the Fourier transform. In contrast with earlier papers we introduce lifting purely from a wavelet transform point of view and only consider the wavelet basis functions in a later stage. We show how lifting leads to a faster, fully in-place implementation of the wavelet transform. Moreover, it can be used in the construction of second generation wavelets, wavelets that are not necessarily translates and dilates of one function. A typical example of the latter are wavelets on the sphere. Keywords: wavelet, biorthogonal, in-place calculation, lifting 1 Introduction At the present day it has become virtually impossible to give the definition of a "wavelet". The research field is growing so fast and novel contributions are made at such a rate that even if one manages to give a definition today, it might be obsolete tomorrow. One, very vague, way of thinking about...
Article
The lifting scheme is a new flexible tool for constructing wavelets and wavelet transforms. In this paper, we use the Euclidean algorithm to show how any discrete wavelet transform or two band subband transform with finite filters can be obtained with a finite number of lifting steps starting from the Lazy wavelet (or polyphase transform). We show a bound on the number of lifting steps that is proportional to the length of the filters. This factorization provides an alternative for the lattice factorization, with the advantage that it can also be used in the biorthogonal (non-unitary) case. The lifting factorization asymptotically reduces the computational complexity of the transform by a factor of two and allows for wavelet transforms that map integers to integers. 1. Introduction Over the last decade several constructions of compactly supported wavelets originated both from mathematical analysis and the signal processing community. The roots of critically sampled wavelet ...
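A minimal sketch can show why lifting steps give perfect reconstruction by construction, even with a nonlinear rounding operation inside them. The example below assumes the simplest case: a Lazy (even/odd) split followed by one predict and one update step with integer floor division, often called the S transform. The function names are hypothetical and the filters are not those of the paper.

```python
def lift_forward(x):
    """Lazy split, then one predict and one update step, all in integers."""
    if len(x) % 2 != 0:
        raise ValueError("input length must be even")
    even, odd = x[0::2], x[1::2]
    d = [o - e for e, o in zip(even, odd)]           # predict: detail signal
    s = [e + (di // 2) for e, di in zip(even, d)]    # update: rounded average
    return s, d

def lift_inverse(s, d):
    """Undo the steps in reverse order with the signs flipped."""
    even = [si - (di // 2) for si, di in zip(s, d)]  # undo update
    odd = [di + e for di, e in zip(d, even)]         # undo predict
    x = [0] * (2 * len(s))
    x[0::2] = even
    x[1::2] = odd
    return x

x = [5, 7, -3, 2, 10, 10, 0, -9]
s, d = lift_forward(x)
restored = lift_inverse(s, d)
```

Each inverse step subtracts exactly what the forward step added, computed from the same inputs, so `restored` equals `x` regardless of the rounding used inside the step. This is the same structural argument that allows lifting-based transforms to map integers to integers.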
Article
A new filter structure and design method for time-varying cosine modulated FIR filter banks with critical sampling, perfect reconstruction, and an efficient implementation is presented. The proposed filter banks have an arbitrary system delay which can be chosen in the design process and is independent of the arbitrary filter length, hence making a low system delay possible. The time variation includes changing the number of bands and/or filters during signal processing while maintaining critical sampling and perfect reconstruction. The transition windows can be overlapping, which improves the frequency responses. It is based on a factorization of the polyphase matrices into a cascade of 2 types of simple matrices. 1. INTRODUCTION The system considered here is an N band (N not necessarily even) cosine modulated FIR filter bank with critical downsampling and perfect reconstruction (PR). If a support preservative PR filter bank is time varying the filters and sometimes also the number ...
Coding of audio signals with overlapping block transform and adaptive window functions
  • B Edler
B. Edler, “Coding of audio signals with overlapping block transform and adaptive window functions (in German),” Frequenz, pp. 252–256, Sept. 1989.
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 48, NO. 3, MARCH 2000