

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 16, NO. 11, NOVEMBER 2006 1437

Low-Complexity Multiresolution Image Compression Using Wavelet Lower Trees

Jose Oliver, Member, IEEE, and Manuel P. Malumbres, Member, IEEE

Abstract—In this paper, a new image compression algorithm is proposed based on the efficient construction of wavelet coefficient lower trees. The main contribution of the proposed lower-tree wavelet (LTW) encoder is the utilization of coefficient trees, not only as an efficient method of grouping coefficients, but also as a fast way of coding them. Thus, it presents state-of-the-art compression performance, whereas its complexity is lower than that of other wavelet coders, like SPIHT and JPEG 2000. Fast execution is achieved by means of a simple two-pass coding and one-pass decoding algorithm. Moreover, its computation does not require additional lists or complex data structures, so there is no memory overhead. A formal description of the algorithm is provided, and reference software is also given. Numerical results show that our codec works faster than SPIHT and JPEG 2000 (up to three times faster than SPIHT and fifteen times faster than JPEG 2000), with similar coding efficiency.

Index Terms—Image compression, low complexity, tree-based coding, wavelets.

I. INTRODUCTION

During the last decade, several image compression schemes emerged to overcome the known limitations of block-based algorithms that use the discrete cosine transform (DCT). Some of these alternative proposals were based on more complex techniques, like vector quantization [2] and fractal image coding [3], whereas others simply proposed the use of a different and more suitable mathematical transform: the discrete wavelet transform (DWT) [4]. At that time, there was a general idea: more efficient image coders could only be achieved by means of sophisticated techniques with high complexity. The embedded zero-tree wavelet coder (EZW) [5] can be considered the first wavelet image coder that broke that trend. Since then, many wavelet coders have been proposed and, finally, the DWT was included in the JPEG 2000 standard [6] due to its compression efficiency, among other interesting features (scalability, etc.).

All wavelet-based image coders, and in general all transform-based coders, consist of two main stages. In the first one, the image is transformed from the spatial domain to another one, in the case of the wavelet transform a combined spatial-frequency domain called the wavelet domain. In the second stage, the transform coefficients are quantized and encoded in an efficient way to achieve high compression efficiency and other features.

Manuscript received March 16, 2005; revised May 30, 2006; accepted May 30, 2006. This work was supported by the Spanish Ministry of Education and Science under grant TIC2003-00339. A preliminary version of this paper was first presented at the IEEE Data Compression Conference, Snowbird, UT, in March 2003. This paper was recommended by Associate Editor H. Gharavi.

J. Oliver is with the Department of Computer Engineering (DISCA), Polytechnic University of Valencia, 46022 Valencia, Spain (e-mail: joliver@disca.upv.es).

M. P. Malumbres is with the Department of Physics and Computer Engineering, Miguel Hernandez University, 03202 Elche, Spain (e-mail: mels@umh.es).

Digital Object Identifier 10.1109/TCSVT.2006.883505

The wavelet transform can be implemented as a regular filter bank; however, several strategies have been proposed to reduce the running time and memory requirements. For example, line-based processing is proposed in [7], whereas an alternative wavelet transform method, called the lifting scheme, is proposed in [8]. The lifting transform provides in-place calculation of the coefficients by overwriting the input samples, and it reduces the number of operations required to compute the DWT.
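As an illustration of the in-place idea, the following sketch applies one level of the integer CDF 5/3 lifting transform (the filter used for lossless coding in JPEG 2000) to a 1-D signal. The function name and the simplified symmetric border handling are our own choices, not taken from [8]:

```python
def lifting_53_forward(x):
    """One level of the integer CDF 5/3 lifting DWT, computed in place.

    x: list of samples, even length. After the call, x holds the transform
    interleaved: even indices carry the low-pass band, odd the high-pass.
    """
    n = len(x)
    assert n % 2 == 0
    # Predict step: each odd sample becomes a high-pass detail,
    # overwriting the input sample (this is the in-place property).
    for i in range(1, n, 2):
        left = x[i - 1]
        right = x[i + 1] if i + 1 < n else x[i - 1]  # symmetric extension
        x[i] -= (left + right) // 2
    # Update step: even samples become the low-pass approximation.
    for i in range(0, n, 2):
        left = x[i - 1] if i - 1 >= 0 else x[i + 1]
        right = x[i + 1] if i + 1 < n else x[i - 1]
        x[i] += (left + right + 2) // 4
    return x
```

On a constant signal the details vanish and the low-pass band reproduces the input, which is a quick sanity check of the implementation.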

On the other hand, the coding pass is not usually improved in terms of complexity and memory usage. When designing a new wavelet image encoder, the most important factor to optimize is usually the rate/distortion (R/D) performance, whereas other features like an embedded bitstream, signal-to-noise ratio (SNR) scalability, spatial scalability, and error resilience are also considered. In this paper, we propose an algorithm that aims to achieve state-of-the-art coding efficiency with very low execution time. Moreover, due to in-place processing of the coefficients, there is no memory overhead (it only needs memory to store the source image). In addition, our algorithm is naturally spatially scalable, and it is possible to achieve SNR scalability.

The key idea of the proposed algorithm is the use of wavelet coefficient trees as a fast method of efficiently grouping coefficients. Tree-based wavelet coders have been widely used in the literature [5], [9], [10], [14], presenting good R/D performance. However, their excellent opportunities for fast processing of quantized coefficients have not been clearly shown so far. Wavelet trees are a simple way of grouping coefficients that reduces the total number of symbols to be coded, which leads not only to good compression performance but also to fast processing. Moreover, for a low-complexity implementation, bit-plane coding, present in many wavelet coders [5], [6], [9], [11], [12], must be avoided. This way, multiple scans of the transform coefficients, which involve many memory accesses and cause high cache miss rates, are not performed, at the expense of generating a non-(SNR)-embedded bitstream.

This paper is organized as follows. In Section II, we analyze the complexity and coding efficiency of some important wavelet image coders, focusing on the non-embedded proposals. In Section III, the proposed tree-based algorithm is described. Section IV describes some implementation details and optimization considerations. Finally, in Section V, we compare our proposal with other wavelet image coders using real implementations.

II. PREVIOUS WAVELET IMAGE CODERS AND THEIR COMPLEXITY

One of the first efficient wavelet image coders reported in the literature is EZW [5]. It is based on the construction of coefficient trees and successive approximations, which can be implemented with bit-plane coding. Due to its successive-approximation nature, it is SNR scalable, although at the expense of spatial scalability. SPIHT [9] is an advanced version of this algorithm, where coefficient trees are processed in a more efficient way. In this coder, coefficient trees are partitioned depending on the significance of the coefficients belonging to each tree. If a coefficient in a tree is higher than a threshold determined by the current bit plane, the tree is successively divided following some established partitioning rules. Both EZW and SPIHT need the computation of coefficient trees to search for significant coefficients, and they take several iterations, each focusing on a different bit plane, which involves high computational complexity.

A block-based version of SPIHT, called SPECK, is presented in [11]. The main difference of this new version is that coefficients are grouped and partitioned using rectangular structures instead of coefficient trees. A low-complexity implementation of SPECK, called SBHP [12], was proposed in the framework of JPEG 2000. In SBHP, an image is first divided into blocks (sized 64×64 or 128×128) and then each block is processed as in SPECK. To reduce complexity, Huffman coding is used instead of arithmetic coding, which causes a decrease in coding efficiency.

In the final JPEG 2000 standard [6], the proposed algorithm (a modified version of EBCOT [13]) does not use coefficient trees, but it performs bit-plane coding in code blocks, with three passes per plane, so that the most important information is encoded first. In order to overcome the disadvantage of not using coefficient trees, it uses an iterative optimization algorithm, based on the Lagrange multiplier method, along with a large number of contexts. JPEG 2000 obtains both spatial and SNR scalability by reordering the encoded image. Hence, the final bitstream is very versatile. However, this encoder is complex, mainly due to the use of the time-consuming iterative optimization algorithm, the introduction of a reordering stage, and the use of bit-plane coding with many contexts. In addition, a larger bitstream than the one finally emitted is usually generated.

Wavelet image encoders often have features that are not always needed, but which make them both CPU and memory intensive; fast processing is often preferable. Low complexity is desirable, for instance, in high-resolution digital camera shooting, where long delays can be annoying or even unacceptable (all the more so for modern digital cameras, whose resolution keeps increasing). In general, image editing for large images (especially in GIS applications) cannot be easily tackled with the complexity of previous encoders.

Since iterative methods and bit-plane coding must be avoided to reduce complexity, very fast coding can only be achieved through simpler non-SNR-embedded techniques (as in baseline JPEG). In these encoders, an image is encoded at a constant quality after applying a uniform quantization to the coefficients.

A. Non-Embedded Coders

One of the first tree-based non-embedded coders is SFQ [10]. This wavelet encoder uses space quantization by means of a tree-pruning algorithm, which modifies the shape of the trees by pruning their branches, whereas a scalar quantization is applied for frequency quantization. Although it achieves higher compression than SPIHT, the iterative tree-pruning stage makes it about five times slower than SPIHT.

Non-embedded coding was first proposed as a means to reduce complexity in tree-based wavelet coding in [14], where a non-embedded version of SPIHT was introduced. In this modified SPIHT, once a coefficient is found to be significant, all its significant bits are encoded at once, avoiding the refinement passes (see [9] for details). Note that non-embedded versions of SPECK and SBHP are also possible with the same modifications. Although these non-embedded versions are faster than the original ones, neither the multiple image scans nor the bit-plane processing of the sorting passes (used to find significant coefficients) is avoided, and hence the complexity problem remains. Note also that these modifications of SPIHT and SPECK are neither SNR nor resolution scalable.

Besides the JPEG standard, other DCT-based non-embedded image coders have been proposed, in particular the intra mode of the H.264 standard [16]. However, due to the use of time-consuming prediction techniques on the encoder side, the coding/decoding processes are very asymmetric, and the resulting encoder is very slow.

III. MULTIRESOLUTION IMAGE CODING USING LOWER TREES

For the most part, digital images are represented as a set of pixels p_{i,j}. The encoder proposed in this paper is applied to a set of coefficients c_{i,j} resulting from a dyadic decomposition of the image. The most commonly used decomposition for image compression is the hierarchical wavelet subband transform [4]; thus, an element c_{i,j} is called a transform coefficient. In a wavelet transform, we call HL_1, LH_1, and HH_1 the subbands resulting from the first level of the image decomposition, corresponding to horizontal, vertical, and diagonal frequencies. The rest of the transform is computed with a recursive wavelet decomposition of the remaining low-frequency subband, until a desired decomposition level N is achieved (LL_N is the remaining low-frequency subband).

In Section II, we mentioned that one of the main drawbacks of previous wavelet-based image encoders is their high complexity, often due to time-consuming iterative methods and to the bit-plane coding used to provide a fully embedded bitstream. Although embedding is a nice feature in an image coder, it is not always needed, and other alternatives, like spatial scalability, may be more valuable depending on the final application. In this section, we propose a tree-based coding algorithm that is able to encode the wavelet coefficients without performing an image scan per bit plane.

Tree-based wavelet image encoders have proved to store the transform coefficients efficiently, achieving good performance results. However, in the algorithm proposed in this paper, a tree-based structure is introduced not only to remove redundancy among subbands, but also as a simple and fast way of grouping coefficients.

As in other tree-based encoders, coefficients can be logically arranged as trees, as shown in Fig. 1. In this figure, we observe that the coefficients in the wavelet subbands (except the leaves) always have four direct descendants (i.e., four descendants at a distance of one), while the rest of the descendants can be obtained recursively from the direct descendants. On the other hand, in the LL_N subband, three out of every 2×2 coefficients have four direct descendants, and the remaining coefficient in the 2×2 block has no descendants.

Fig. 1. Coefficient trees in the proposed algorithm.

In our proposal, the quantization process is performed with two strategies: one coarser and one finer. The finer one consists in applying a scalar uniform quantization to the coefficients, and it can be combined with the normalization factor of the lifting transform (if the normalization factor in the lifting scheme is K, and the quantization factor is Q, only a multiplication by K/Q is needed). The coarser one is based on removing bit planes from the least significant part of the coefficients, and it is performed while the algorithm is applied. Related to this bit-plane quantization, we define rplanes as the number of least significant bits to be removed.

We say that a coefficient c_{i,j} is significant if it is different from zero after discarding its rplanes least significant bits; in other words, if |c_{i,j}| ≥ 2^rplanes. In our proposal, significant coefficients are encoded using arithmetic coding, with a symbol indicating the number of bits needed to encode each coefficient. Then, the significant bits and the sign are binary encoded (in other words, "raw encoded").
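Under this definition, the significance test and the number-of-bits symbol reduce to simple bit operations. The sketch below is our own illustration (the reference software may organize this differently):

```python
def is_significant(c, rplanes):
    # Significant iff the coefficient is nonzero after dropping its
    # rplanes least significant bits, i.e. |c| >= 2**rplanes.
    return (abs(c) >> rplanes) != 0

def nbits_symbol(c):
    # Symbol sent to the adaptive arithmetic coder for a significant
    # coefficient: the number of bits needed to represent |c|.
    return abs(c).bit_length()
```

For example, with rplanes = 2 the coefficient −13 (binary magnitude 1101) is significant and produces the numeric symbol 4, whereas 3 is insignificant.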

Regarding the insignificant coefficients, a coefficient is called a lower-tree root if this coefficient and all its descendants are insignificant (i.e., lower than 2^rplanes). The set formed by all these coefficients is a lower-tree. We use the symbol LOWER to point out that a coefficient is the root of a lower-tree. The rest of the coefficients in the lower-tree are labeled as LOWER_COMPONENT, but they are not encoded. On the other hand, if a coefficient is lower than 2^rplanes but does not form or belong to a lower-tree because it has at least one significant descendant, it is labeled ISOLATED_LOWER.

A. Lower-Tree Encoder Algorithm

Once we have defined the basic concepts needed to understand the algorithm, we are ready to describe the coding process. It is a two-pass algorithm. During the first pass, the wavelet coefficients are properly labeled according to their significance, and the lower-trees are formed. In the second image pass, the coefficient values are coded by using the labels computed in the first pass.

The coding algorithm is presented in Algorithm 1. Let us describe this algorithm. At the encoder initialization, the number of bits needed to represent the highest coefficient (maxplane) is calculated. This value and the rplanes parameter are output to the decoder. Afterwards, we initialize an adaptive arithmetic encoder that is used to encode the number of bits required by the significant coefficients, along with the LOWER and ISOLATED_LOWER symbols.

In the first image pass, the lower-tree labeling process is performed recursively, by building the lower-trees from the leaves to the roots. In the first-level subbands, coefficients are scanned in 2×2 blocks and, if the four coefficients are insignificant (i.e., lower than 2^rplanes), they are considered part of the same lower-tree and labeled as LOWER_COMPONENT. Then, when scanning higher level subbands, if a 2×2 block has four insignificant coefficients and all their direct descendants are LOWER_COMPONENT, the coefficients in that block are also labeled as LOWER_COMPONENT, increasing the size of the lower-tree.

However, when at least one coefficient in the block is significant, the lower-tree cannot continue growing. In that case, an insignificant coefficient in the block is labeled as LOWER if all its descendants are LOWER_COMPONENT; otherwise, the insignificant coefficient is labeled as ISOLATED_LOWER.
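The two labeling rules can be sketched on a toy coefficient pyramid as follows. This is our simplified model, not the reference implementation: levels[0] is the coarsest level, each coefficient at level k has four children at level k+1, and the special layout of the LL_N subband is ignored; since a block at level 0 has no parent, a fully insignificant block there is labeled LOWER directly:

```python
LOWER_COMPONENT = "LC"   # member of a lower-tree already represented
LOWER = "L"              # root of a lower-tree
ISOLATED_LOWER = "IL"    # insignificant, but some descendant is significant

def label_pass(levels, rplanes):
    """First LTW pass: label insignificant coefficients, leaves to roots.

    levels: list of square 2-D lists; levels[k+1] is twice the size of
    levels[k], and (i, j) at level k has children (2i+a, 2j+b), a,b in {0,1}.
    """
    def significant(c):
        return (abs(c) >> rplanes) != 0

    labels = [[[None] * len(lvl) for _ in lvl] for lvl in levels]

    def descendants_clear(k, i, j):
        # True when (i, j) is a leaf or all four children were already
        # labeled LOWER_COMPONENT, i.e. its whole subtree is insignificant.
        if k + 1 == len(levels):
            return True
        return all(labels[k + 1][2 * i + a][2 * j + b] == LOWER_COMPONENT
                   for a in (0, 1) for b in (0, 1))

    for k in range(len(levels) - 1, -1, -1):          # leaves first
        for i in range(0, len(levels[k]), 2):
            for j in range(0, len(levels[k]), 2):
                block = [(i + a, j + b) for a in (0, 1) for b in (0, 1)]
                if all(not significant(levels[k][x][y])
                       and descendants_clear(k, x, y) for x, y in block):
                    # The whole block joins the lower-tree of its parent.
                    for x, y in block:
                        labels[k][x][y] = LOWER_COMPONENT
                else:
                    # The tree stops growing here.
                    for x, y in block:
                        if not significant(levels[k][x][y]):
                            labels[k][x][y] = (LOWER
                                               if descendants_clear(k, x, y)
                                               else ISOLATED_LOWER)

    for i in range(len(levels[0])):                   # roots have no parent
        for j in range(len(levels[0])):
            if labels[0][i][j] == LOWER_COMPONENT:
                labels[0][i][j] = LOWER
    return labels
```

Note that significant coefficients keep the label None: they are not represented by the significance map but by their numeric symbols in the second pass.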

In the second pass, all the subbands are explored from the Nth level to the first one, and all their coefficients are scanned in medium-sized blocks (to take advantage of data locality). For each coefficient in a subband, if it is a lower-tree root or an isolated lower coefficient, the corresponding LOWER or ISOLATED_LOWER symbol is encoded. On the other hand, if a coefficient has been labeled as LOWER_COMPONENT, no output is needed, because this coefficient is already represented by the lower-tree to which it belongs.

A significant coefficient is coded as follows. A symbol indicating the number of bits required to represent that coefficient is arithmetically coded, and the significant bits and sign are "raw coded". However, two types of numeric symbols are used, depending on the direct descendants of that coefficient: (a) a regular numeric symbol, which simply shows the number of bits needed to encode a coefficient, and (b) a special "LOWER numeric symbol", which not only indicates the number of bits of the coefficient but also the fact that its descendants are labeled as LOWER_COMPONENT, and thus belong to a lower-tree not yet coded. This type of symbol is able to represent efficiently some special lower-trees, in which the root coefficient is significant and the rest of the coefficients are insignificant. Note that the number of symbols needed to represent both sets of numeric symbols is 2 × (maxplane − rplanes); therefore, the arithmetic encoder must be initialized to handle at least this number of symbols, along with two additional symbols: LOWER and ISOLATED_LOWER. Observe that the first rplanes bits and the most significant nonzero bit are not encoded (the decoder can deduce the most significant nonzero bit from the arithmetic symbol that indicates the number of bits required to encode the coefficient).
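As a sketch of what is actually emitted for one significant coefficient, the decomposition below is our own, consistent with the description above: the number-of-bits symbol is arithmetically coded, and only the bits strictly between the implicit most significant bit and the discarded rplanes planes are raw coded, followed by the sign.

```python
def encode_significant(c, rplanes):
    # Returns (nbits, raw_bit_count, raw_payload, sign_bit) for a
    # significant coefficient c. nbits goes to the arithmetic coder;
    # raw_payload holds the raw-coded middle bits; the leading 1 and
    # the rplanes lowest planes are never transmitted.
    mag = abs(c)
    assert (mag >> rplanes) != 0, "coefficient must be significant"
    nbits = mag.bit_length()
    count = nbits - 1 - rplanes                 # bits between MSB and rplanes
    payload = (mag >> rplanes) & ((1 << count) - 1)
    return nbits, count, payload, 1 if c < 0 else 0
```

For c = 13 with rplanes = 2 (binary 1101), the symbol 4 is arithmetically coded and a single raw bit (the 1 at position 2) plus the sign bit is emitted.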

An important difference between our tree-based wavelet algorithm and others like [5] and [9] is how the coefficient-tree building process is specified. Our algorithm includes a simple and efficient recursive method (within the E2 stage) to determine whether a coefficient has significant descendants, in order to form the coefficient trees. However, EZW and SPIHT leave this as an open, implementation-dependent aspect, which drastically increases the algorithm complexity if it is not carefully solved (for example, if searches for significant descendants are independently calculated for every coefficient whenever they are needed). In any case, an efficient implementation of EZW and SPIHT would increase their memory consumption, due to the need to store the maximum descendant of every coefficient obtained during a pre-search stage.

B. Lower-Tree Decoder Algorithm

The decoding algorithm performs the reverse process in only one pass, since symbols are directly decoded from the incoming bitstream. All the subbands must be scanned in the order used by the encoder. The granularity of this scan is 2×2 coefficient blocks, so that coefficients sharing the same parent (i.e., sibling coefficients) are handled together. Note that all the coefficient blocks have an ascendant except those in the LL_N subband.

When an insignificant coefficient is decoded, we set its value to 0, since its order of magnitude is unknown (we only know that it is lower than 2^rplanes). Afterwards, insignificant coefficients in a lower-tree are automatically propagated: once the parent of four sibling coefficients has been set to 0, all the descendant coefficients are also assigned a value of 0, so that lower-trees are recursively generated. However, if an isolated lower coefficient has been decoded, it must not be propagated as a lower-tree; hence, a different value must be assigned. For this case, we keep this coefficient marked as ISOLATED_LOWER until its 2×2 direct descendants are scanned. At that moment, we can safely update its value to 0 without risk of unwanted propagation, because no more direct descendants of this coefficient will be scanned.
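For significant coefficients, the decoder inverts the raw coding described in Section III-A: the arithmetic symbol gives the number of bits (hence the implicit leading 1), and the raw payload restores the middle bits. The mid-point offset 2^(rplanes−1) for the discarded planes in the sketch below is our assumption, since the text fixes the reconstruction value only for insignificant coefficients (zero):

```python
def reconstruct(nbits, payload, sign_bit, rplanes):
    # Rebuild a significant coefficient: place the implicit leading 1,
    # restore the raw-coded middle bits, and replace the discarded
    # rplanes planes by the middle of their uncertainty interval.
    mag = (1 << (nbits - 1)) | (payload << rplanes)
    if rplanes > 0:
        mag |= 1 << (rplanes - 1)
    return -mag if sign_bit else mag
```

Continuing the earlier example, a coefficient transmitted as (nbits = 4, payload = 1, positive) with rplanes = 2 is reconstructed as 14, i.e., within 2^rplanes of any original value that produced those symbols.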

C. Encoder Features

The proposed algorithm is resolution scalable due to the selected scanning order and the nature of the wavelet transform. This way, the first subband that the decoder receives is LL_N, which is a low-resolution scaled version of the original image. Then, the decoder progressively receives the remaining subbands, from the lower-frequency subbands to the higher-frequency ones, which are used as a complement to the low-resolution image to recursively double its size; this is known as the Mallat decomposition [17]. Spatial and SNR scalability are closely related features. Spatial scalability allows us to have versions of the same image at different resolutions. Through interpolation techniques, all these images could be resized to the original size, so that the larger an image is, the closer to the original it gets. Therefore, this algorithm could also be used for SNR scalability purposes. Other features, like the definition of regions of interest (ROI), can be implemented in this coder by using a specific rplanes parameter for the coefficients in the trees representing the special area.

Fig. 2. A two-level wavelet transform of an 8×8 example image.

As in most non-embedded encoders (like baseline JPEG, H.264 intra mode, non-embedded SPIHT/SPECK, etc.), precise rate control is not feasible in the proposed algorithm, because it is usually achieved with bit-plane coding (as in SPIHT) or iterative methods (as in JPEG 2000 or SFQ), which significantly increase the complexity of the encoder. Nevertheless, approximate rate control is still possible, based on a fast analysis of the transformed image from which one or more numerical parameters are extracted and used to determine the quantization parameters needed to approximate a target bit rate (see [15] for details).

D. A Simple Example

Fig. 2 shows a small 8×8 image that has been transformed using a two-level DWT. In this section, we show the result of applying our tree-based encoder to this sample image. The wavelet coefficients have been scalar quantized, and the selected rplanes parameter is 2. The maxplane parameter can be easily computed as ⌈log2(max |c_{i,j}|)⌉, which is 6 in this example. When the coarser quantization is applied using rplanes = 2, the values within the interval [−3, 3] (i.e., those lower than 2^rplanes) are quantized to zero. Thus, all the significant coefficients can be represented using from 3 to 6 bits, and hence the symbol set needed to represent the significance map is {LOWER, ISOLATED_LOWER, 3, 4, 5, 6, 3^L, 4^L, 5^L, 6^L}.

In this symbol set, special "LOWER numeric symbols" are marked with a superscript L, an ISOLATED_LOWER symbol represents an isolated lower coefficient, and a LOWER symbol indicates a regular lower symbol (the root of a lower-tree). Coefficients belonging to a previously encoded lower-tree (those labeled as LOWER_COMPONENT) are not encoded and, in our example, are represented with a star (*). Fig. 3 shows the symbols resulting from applying our algorithm to the example image and, if we scan the subbands from the highest level (N) to the lowest one (1), in 2×2 blocks, from left to right and top to bottom, the resulting bitstream is the one illustrated in Fig. 4. Note that the sign is not necessary for the LL subband, since its coefficients are always positive.

Fig. 3. Symbols resulting from applying our algorithm to the example image.

IV. IMPLEMENTATION CONSIDERATIONS

Implementation details and further adjustments may improve the performance of a compression algorithm. In this section, we give a guide for a successful implementation of the LTW algorithm. All the improvements introduced in this section should preserve fast processing.

Context coding has been widely used to improve the R/D performance in image compression. Although high-order context modeling presents high complexity, simpler context coding can be employed efficiently without a noticeable increase in execution time. We propose the use of two contexts, based on the significance of the left and upper coefficients: if both are insignificant or close to insignificant, a different model is used for coding. Adding a few more models to establish more significance levels (and thus more contexts) would improve compression efficiency. However, it would slow down the algorithm, mainly due to the context formation evaluation and the higher memory usage.

Recall that maxplane indicates the number of bits needed to represent the highest coefficient in the wavelet decomposition. This value (along with rplanes) determines the number of symbols needed by the arithmetic encoder. However, coefficients in different subbands tend to differ in order of magnitude, so this parameter can be set specifically for every subband level. In this manner, the arithmetic encoder is initialized with exactly the number of symbols needed in every subband, which increases the coding efficiency.

Regarding the coarser quantization, consider the case in which three coefficients in a block are insignificant and the fourth value is very close to insignificant. In this case, we can consider the entire block insignificant and label all its coefficients as LOWER_COMPONENT. The slight error introduced is compensated by the saving in bit budget.
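A possible form of this heuristic is sketched below; the tolerance parameter is our own choice, since the text does not quantify "very close to insignificant":

```python
def force_block_insignificant(block, rplanes, slack=1):
    # block: the four coefficients of a 2x2 block. Treat the block as
    # insignificant when at most one coefficient reaches the significance
    # threshold 2**rplanes, and only by less than `slack`.
    thr = 1 << rplanes
    over = [abs(c) - thr for c in block if abs(c) >= thr]
    return len(over) == 0 or (len(over) == 1 and over[0] < slack)
```

With rplanes = 2 and slack = 1, a block such as (0, 1, 2, 4) would be folded into a lower-tree, whereas (0, 0, 0, 7) would not.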

In general, each of these R/D improvements causes a slight increase in PSNR (from 0.05 to 0.1 dB, depending on the source image and the target bit rate).

V. COMPARISON WITH OTHER WAVELET CODERS USING REAL IMPLEMENTATIONS

We have implemented the lower-tree wavelet (LTW) coding and decoding algorithms in order to test their performance. They have been implemented in standard C++ (without using assembly language or platform-dependent features), and the simulation tests have been performed on a regular personal computer (with a 500-MHz Pentium Celeron processor with 256 KB of L2 cache), generating image files that contain the compressed images, including the file headers required for self-contained decompression. The reader can easily perform new tests by using the LTW implementation available at http://www.disca.upv.es/joliver/LTW.

Fig. 4. Example image encoded using the proposed tree-based algorithm.

TABLE I
PSNR (dB) WITH DIFFERENT BIT RATES AND CODERS USING LENA (512×512)

In order to compare our algorithm with other wavelet encoders, we have selected the classical Lena and Barbara images (monochrome, 8 bpp, 512×512), the VisTex texture database (monochrome, 8 bpp, 512×512) from the MIT Media Lab, and the Café and Woman images (monochrome, 8 bpp, 2560×2048) from the JPEG 2000 test bed. The widely used Lena image (from the USC) allows us to compare with practically all the published algorithms, since results are commonly reported for this image. The VisTex set allows testing the behavior of the coders when coding textures. Finally, the Café and Woman images are less blurred and more complex than Lena, and they represent pictures taken with a 5-Mpixel high-definition digital camera.

Table I provides the R/D performance obtained with the Lena image. In general, we see that LTW achieves similar or better results than the other coders, including the JPEG 2000 standard, whose results have been obtained using the reference software Jasper [18], an official implementation included in the ISO/IEC 15444-5 standard. For SPIHT, we have used the implementation provided by the authors of the original paper. For H.264 intra-mode coding, results have been obtained with the JM 9.6 reference software. For this coder, the PSNR result for each bit rate has been interpolated from the nearest bit rates, since the quantization granularity is very coarse and an exact bit rate cannot be achieved.

PSNR results for the rest of images are shown in Table II.

In this only SPIHT and Jasper have been compared to LTW,

since the compiled versions of the rest of coders have not been

TABLE II

PSNR (dB) W

ITH DIFFERENT

BIT

RATES AND

CODERS FOR

CAFÉ,W

OMAN,

B

ARBARA,

AND VisTex D

ATABASE

released, or results are not published for these images. We can

observe that the R/D performance for Café is still higher with our algorithm, although Jasper performs similarly to LTW. For Woman, our algorithm outperforms both SPIHT and Jasper. In the case of the textures from the VisTex database, we report the average PSNR in this table, which shows that, in general, LTW codes textures better than JPEG 2000 and SPIHT. Only a few images in the set (and only at high bit rates) show better coding results for JPEG 2000 (detailed results for each image in the set are available at http://www.disca.upv.es/joliver/csvt/textures.pdf). However, at high bit rates, Jasper encodes high-frequency images, like Barbara, better than LTW. Two reasons may explain this. First, when high-frequency images are encoded at high bit rates, many coefficients in high-frequency subbands are significant, and hence our algorithm cannot build large lower-trees. Second, recall that JPEG 2000 uses many contexts, which yields higher performance for high-frequency images. In our experiments, we have observed that if more than two contexts are used in LTW, the R/D performance for Barbara approaches the Jasper results, but at the cost of higher execution time.

On the other hand, the reader can perform a subjective evaluation of the Lena and Barbara images encoded at 0.125 bpp with SPIHT, JPEG 2000, and LTW by using Fig. 5, which shows a detail of Lena's face and a detail of Barbara's checked trousers. In the first group of pictures, if we look carefully at Lena's lips, eyes, and nose, we can observe that LTW offers the best results, achieving better definition and sharpness than SPIHT and, especially, than JPEG 2000. However, if we analyze a highly detailed image such as Barbara and focus on a high-frequency area (e.g., the trousers in the Barbara image), we can clearly see that JPEG 2000 gives the best results, while SPIHT, which decompresses an image full of blurred areas, produces the worst. Both observations are consistent with the PSNR results, and confirm that tree-based algorithms work better on moderately to lightly detailed images, while JPEG 2000 is a better option for highly detailed images, for the reasons given above.

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 16, NO. 11, NOVEMBER 2006 1443

Fig. 5. Details of the Lena and Barbara images (at 0.125 bpp) for a subjective evaluation, using (a) SPIHT, (b) JPEG 2000, and (c) LTW.

TABLE III
EXECUTION TIME COMPARISON FOR LENA AND CAFÉ (TIME IN MILLIONS OF CPU CYCLES)

The main advantage of the LTW algorithm is its lower complexity. Table III shows that our algorithm greatly outperforms SPIHT and Jasper in terms of execution time.¹

For medium-sized images (512 × 512), our encoder is from 3.25 to 11 times faster than Jasper, whereas the LTW decoder runs from 1.5 to 2.5 times faster than the Jasper decoder, depending on the rate. In the case of SPIHT, our encoder is from 2 to 2.5 times faster, and the decoding process is from 1.7 to 2.1 times faster. With larger images, like Café (2560 × 2048), the advantage is greater: the LTW encoder is from 5 to 16 times faster than Jasper, and the decoder is from 1.5 to 2.5 times faster. With respect to SPIHT, our algorithm encodes Café from 1.7 to 3 times faster, and decodes it from 1.5 to 2.5 times faster.

Note that in these tables we have evaluated only the coding and decoding processes, and not the transform stage, since the wavelet transform is the same in all cases: the popular Daubechies 9/7 biorthogonal wavelet filter. Other wavelet transforms, like Daubechies 23/25, have shown better compression performance; however, this improvement comes at the cost of more filter taps, and thus increases the execution time of the DWT.

¹Measuring the complexity of algorithms is a hard issue, since the execution time of an implementation depends largely on its level of optimization. Thus, there are commercial implementations of JPEG 2000 not included in ISO/IEC 15444-5 that are faster than Jasper; however, they are usually implemented with platform-dependent pieces of code (in assembly language) and multimedia SIMD instructions. In our tests, the SPIHT, JPEG 2000, and LTW implementations are, as far as possible, written and compiled under the same conditions, using plain C/C++ and MS Visual C++ 6.0 for all of them.


Other non-embedded encoders, like SFQ and H.264 intra mode, are much slower than LTW. In particular, SFQ coding is more than 10 times slower than LTW, while the H.264 encoder (JM 9.6 implementation) is more than 50 times slower than our proposal, due to the time-consuming predictive algorithm used in this encoder.

Besides compression performance and complexity, a third major issue usually considered in an image coder is memory usage. Our wavelet encoder and decoder are able to perform in-place processing of the wavelet coefficients, and thus do not need to handle additional lists or memory-consuming structures. This way, only 21 MB are needed to encode the Café image using LTW (note that 20 MB are needed just to store the image in memory using an integer type), whereas SPIHT and Jasper require 42 and 64 MB, respectively.²

VI. CONCLUSION

In this paper, we have presented a new wavelet image encoder based on the construction and efficient coding of wavelet lower-trees (LTW). Its compression performance is within the state of the art, achieving results similar to other popular algorithms (it improves on SPIHT by 0.2–0.4 dB, and on JPEG 2000 by 0.35 dB on average for Lena).

However, the main contribution of this algorithm is its lower complexity. Depending on the image size and bitrate, it is able to encode an image up to 15 times faster than Jasper and three times faster than SPIHT. Thus, the LTW coder is one of the fastest efficient image coders reported in the literature.

Therefore, due to its lower complexity, its high symmetry, its simple design, and its lack of memory overhead, we think that LTW is a good candidate for real-time interactive multimedia communications, allowing both hardware and software implementations.

²Results obtained with the Windows XP task manager, "peak memory usage" column.

REFERENCES

[1] J. Oliver and M. P. Malumbres, "Fast and efficient spatial scalable image compression using wavelet lower trees," in Proc. IEEE Data Compression Conf., Snowbird, UT, Mar. 2003, pp. 133–142.
[2] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression. Norwell, MA: Kluwer, 1991.
[3] A. E. Jacquin, "Image coding based on a fractal theory of iterated contractive image transformation," IEEE Trans. Image Process., vol. 1, pp. 18–30, Jan. 1992.
[4] M. Antonini, M. Barlaud, P. Mathieu, and I. Daubechies, "Image coding using wavelet transform," IEEE Trans. Image Process., vol. 1, no. 2, pp. 205–220, Feb. 1992.
[5] J. M. Shapiro, "Embedded image coding using zerotrees of wavelet coefficients," IEEE Trans. Signal Process., vol. 41, no. 12, pp. 3445–3462, Dec. 1993.
[6] JPEG2000 Image Coding System, ISO/IEC 15444-1, 2000.
[7] C. Chrysafis and A. Ortega, "Line-based, reduced memory, wavelet image compression," IEEE Trans. Image Process., vol. 9, pp. 378–389, Mar. 2000.
[8] W. Sweldens, "The lifting scheme: A custom-design construction of biorthogonal wavelets," Appl. Comput. Harmon. Anal., vol. 3, pp. 186–200, 1996.
[9] A. Said and W. A. Pearlman, "A new, fast, and efficient image codec based on set partitioning in hierarchical trees," IEEE Trans. Circuits Syst. Video Technol., vol. 6, no. 3, pp. 243–250, Jun. 1996.
[10] Z. Xiong, K. Ramchandran, and M. Orchard, "Space-frequency quantization for wavelet image coding," IEEE Trans. Image Process., vol. 6, no. 5, pp. 677–693, May 1997.
[11] W. A. Pearlman, A. Islam, N. Nagaraj, and A. Said, "Efficient, low-complexity image coding with a set-partitioning embedded block coder," IEEE Trans. Circuits Syst. Video Technol., vol. 14, no. 11, pp. 1219–1235, Nov. 2004.
[12] C. Chrysafis, A. Said, A. Drukarev, A. Islam, and W. A. Pearlman, "SBHP—A low complexity wavelet coder," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., 2000, pp. 2035–2038.
[13] D. Taubman, "High performance scalable image compression with EBCOT," IEEE Trans. Image Process., vol. 9, no. 7, pp. 1158–1170, Jul. 2000.
[14] W. A. Pearlman, "Trends of tree-based, set partitioning compression techniques in still and moving image systems," in Proc. Picture Coding Symp., Apr. 2001, pp. 1–8.
[15] O. Lopez, M. Martinez-Rach, J. Oliver, and M. P. Malumbres, "A heuristic bit rate control for non-embedded wavelet image coders," in Proc. Int. Symp. ELMAR Focused Multimedia Signal Process., Jun. 2006, pp. 13–16.
[16] A. Al, B. P. Rao, S. S. Kudva, S. Babu, D. Sumam, and A. V. Rao, "Quality and complexity comparison of H.264 intra mode with JPEG2000 and JPEG," in Proc. IEEE ICIP, 2004, pp. 525–528.
[17] S. Mallat, "A theory for multiresolution signal decomposition," IEEE Trans. Pattern Anal. Mach. Intell., vol. 11, no. 7, pp. 669–718, Jul. 1989.
[18] M. Adams, Jasper Software Reference Manual (v1.6), ISO/IEC JTC 1/SC 29/WG 1 N 2415, Oct. 2002.