
SPATIALLY CONTINUOUS ORIENTATION ADAPTIVE DISCRETE PACKET WAVELET DECOMPOSITION FOR IMAGE COMPRESSION

Nagita Mehrseresht and David Taubman

The University of New South Wales, Sydney, Australia

ABSTRACT

In this paper, we propose an orientation adaptive discrete wavelet transform (DWT) with perfect reconstruction. The proposed transform utilizes the lifting structure to effectively orient the 2D-DWT bases in the direction of local image features. A shifting operator is employed within each lifting step to align spatial geometric features along the vertical or horizontal directions. The proposed oriented transform generates a scalable representation for the image and the orientation information. To approximate the asymptotically optimal rate-distortion performance of a piecewise regular function more closely, we adopt a packet wavelet decomposition. The experimental results obtained by implementing the proposed transform in a JPEG2000 codec illustrate superior compression performance for the oriented transform, with more than 2.5 dB improvement for highly oriented natural images. More importantly, even at the same PSNR, the proposed scheme significantly reduces the visual appearance of Gibbs-like artifacts, considerably improving the visual quality of the reconstructed image.

1. INTRODUCTION

The wavelet transform has become a popular tool for image and video compression as it generates a sparse and multiscale representation of the image. Using the discrete wavelet transform (DWT) together with embedded coding methods, resolution and quality (SNR) scalability can be readily achieved with good compression efficiency. Moreover, unlike block-based transforms, the DWT does not suffer from boundary artifacts in the presence of quantization noise.

Conventional two dimensional (2D) subband transforms are formed by applying a one dimensional (1D) transform separately along the horizontal and vertical directions. While wavelets are adept at dealing with point-based singularities, they do not necessarily provide a compact representation of edges and higher order singularities. Natural images, on the other hand, commonly contain regions with geometric regularities which can be approximated as linear edges on a local level. These local edges are generally neither vertical nor horizontal and result in considerable energy in the highpass subbands. In addition, at low bit-rates, quantization effects appear as Gibbs-like artifacts at such edges. This jagged and ringing appearance of the edges is visually disturbing.

Significant efforts have been invested in the past to adapt the transform basis to the geometrical regularity of the image. Taubman and Zakhor [1] proposed a scheme in which the image is first re-sampled before the subband transform. The invertible re-sampling process is performed on a block by block basis with the aim of aligning local image edges with the vertical or horizontal directions. A de-blocking function is also applied at block boundaries to reduce the appearance of blocking artifacts in the reconstructed image. Wang et al. [2] later used similar ideas, proposing overlapped extensions to prevent boundary artifacts. The bandelet transform [3] also aims to capture the geometrical image regularity, but it fails to provide a multiscale representation of the image. Second generation discrete bandelets, proposed by Peyre and Mallat [4], generate a multiscale representation of the image by applying a geometrically adaptive bandelet filter to conventional wavelet coefficients. The proposed bandelet filter partitions each resolution into blocks and applies extra decomposition stages along the image singularities. Ding et al. [5] proposed a scheme to incorporate directional spatial prediction into the conventional lifting-based wavelet transforms. Their scheme, however, does not treat predominantly vertical and predominantly horizontal edges alike. If the vertical DWT is performed before the horizontal decomposition, the scheme fails to properly exploit geometric regularities of the image along the predominantly horizontal edges. This is because, despite the high amount of energy in the vertically highpass subband, directional spatial prediction is applied only within the vertically lowpass subband, without using highpass information.

For video compression, Ohm [6] and Taubman and Zakhor [7]

proposed the idea of aligning geometrical features between frames

through motion compensation, before applying a temporal trans-

form. Secker and Taubman [8] and Pesquet-Popescu and Bottreau

[9] later proposed a framework for motion compensated temporal

transformation, using a lifting realization of the DWT where motion

compensation is applied within the lifting steps. To avoid blocking

artifacts, inband motion compensation has also been explored for

wavelet based video compression.

In this paper we employ the idea of lifting-based spatially adaptive DWT for image compression. The proposed scheme uses the lifting structure to effectively orient the 2D DWT basis functions in the direction of local image features. Motivated by work on motion compensated lifting for video, we employ a shifting operation within each lifting step. In doing so, the DWT is effectively applied along the desired direction. For the first stage of the DWT (e.g., vertical decomposition), shifting can be directly applied to the baseband image signal. For the second stage (horizontal decomposition), we use an inband shifting approach.

Importantly, within a region with constant orientation, the pro-

posed scheme is essentially the same as applying the conventional

DWT to a skewed version of the original image. At the boundaries

between regions with different orientations, however, transitions are

performed in the subband domain, allowing the synthesis wavelet

kernel to smooth out possible boundary artifacts.

To generate a wavelet basis which approximates the asymptotically optimal rate-distortion performance of a piecewise regular function more closely, we adopt a packet wavelet decomposition so as to further decompose the subbands with high energy along the direction of geometric flow.

Details of the proposed oriented transform and the desired packet decomposition are given in Sections 2 and 3. Section 4 describes the technique that we use to effectively estimate and code the orientation information. Section 5 experimentally investigates the superior performance of the proposed orientation adaptive wavelet transform. Concluding remarks are given in Section 6.

1424404819/06/$20.00 ©2006 IEEE, ICIP 2006

Fig. 1. An illustration of the sample shifts required to align the vertical DWT basis to the orientation of an edge. [Diagram: sample grid with axes m and n and four labeled sample shifts.]

2. ORIENTATION ADAPTIVE DWT

Without loss of generality, we assume that the 2D DWT is implemented by vertical, followed by horizontal decomposition. We use the notation $x_k$ to represent the vector containing the $k$th row. The conventional vertical 5/3 DWT can be implemented by the following lifting steps,

$$h^{v}_{k} = x_{2k+1} - \tfrac{1}{2}\left(x_{2k} + x_{2k+2}\right)$$

$$l^{v}_{k} = x_{2k} + \tfrac{1}{4}\left(h^{v}_{k-1} + h^{v}_{k}\right)$$

where $h^{v}_{k}$ and $l^{v}_{k}$ correspond to the $k$th row of the high and low pass vertical subbands, respectively.

Fig. 1 illustrates a line with close to vertical orientation. Using the conventional vertical DWT generates significant activity in the highpass subband as the transform basis is not aligned to the orientation of the underlying geometric feature. Motivated by motion compensated lifting, we employ a shifting operator within each lifting step to align features in the vertical direction. We use the notation $W_{i \to j}(x_i)$ for the shifting operator which aligns features in $x_i$ with those in $x_j$. The proposed oriented lifting steps can then be expressed as

$$h^{v}_{k} = x_{2k+1} - \tfrac{1}{2}\left(W_{2k \to 2k+1}(x_{2k}) + W_{2k+2 \to 2k+1}(x_{2k+2})\right) \tag{1}$$

$$l^{v}_{k} = x_{2k} + \tfrac{1}{4}\left(W_{2k-1 \to 2k}(h^{v}_{k-1}) + W_{2k+1 \to 2k}(h^{v}_{k})\right). \tag{2}$$
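The oriented lifting steps (1) and (2) can be illustrated with a minimal Python sketch. It makes several simplifying assumptions that are not part of the paper: the shifting operator is replaced by an integer circular shift (`shift`), a single displacement `d` per row is used for the whole image (constant orientation), and missing neighbours at the top and bottom boundaries are replaced by the available neighbour. All function names are hypothetical. Because each lifting step only adds a function of the other polyphase component, the inverse simply undoes the steps in reverse order with the signs flipped.

```python
def shift(row, d):
    """Circularly shift a row right by d samples (integer stand-in for W)."""
    n = len(row)
    return [row[(i - d) % n] for i in range(n)]


def predict_term(even, k, d):
    """Average of the two even rows adjacent to odd row 2k+1, aligned to it."""
    above = shift(even[k], d)
    # At the bottom boundary there is no row 2k+2, so reuse the row above.
    below = shift(even[k + 1], -d) if k + 1 < len(even) else above
    return [(a + b) / 2 for a, b in zip(above, below)]


def update_term(high, k, d):
    """Quarter-sum of the two highpass rows adjacent to even row 2k."""
    below = shift(high[k], -d)
    # At the top boundary there is no highpass row k-1, so reuse row k.
    above = shift(high[k - 1], d) if k > 0 else below
    return [(a + b) / 4 for a, b in zip(above, below)]


def forward(rows, d):
    """One level of oriented vertical 5/3 lifting (even number of rows)."""
    even, odd = rows[0::2], rows[1::2]
    high = [[x - p for x, p in zip(odd[k], predict_term(even, k, d))]
            for k in range(len(odd))]
    low = [[x + u for x, u in zip(even[k], update_term(high, k, d))]
           for k in range(len(even))]
    return low, high


def inverse(low, high, d):
    """Undo the update step, then the predict step."""
    even = [[l - u for l, u in zip(low[k], update_term(high, k, d))]
            for k in range(len(low))]
    odd = [[h + p for h, p in zip(high[k], predict_term(even, k, d))]
           for k in range(len(high))]
    rows = [None] * (2 * len(low))
    rows[0::2], rows[1::2] = even, odd
    return rows


# A two-pixel-wide line that moves one column to the right per row.
rows = [[100 if c in (4 + r, 5 + r) else 0 for c in range(16)] for r in range(8)]
low, high = forward(rows, d=1)        # transform aligned with the line
low0, high0 = forward(rows, d=0)      # conventional (non-oriented) transform
energy = sum(v * v for row in high for v in row)
energy0 = sum(v * v for row in high0 for v in row)
```

In this toy example the aligned transform leaves no energy in the highpass rows, while the non-oriented one does, and both are inverted exactly by `inverse`.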

So long as the shifting operator $W_{i \to j}$ only uses information from $x_i$, the overall transform can be trivially inverted regardless of the actual implementation of $W$; in this paper we use a windowed cubic spline interpolator with 1/8th pixel accuracy. Other precisions and kernels can also be used within the proposed oriented DWT.

An ad-hoc extension of the proposed scheme for near-horizontal directions is to follow equations (1) and (2) and replace $x_k$ with a vector $y_k$ which contains the vertical subband coefficients at column $k$. However, we cannot apply a baseband shifting operator, $W$, to the vertical subband samples: applying a linear phase shift to highpass subband coefficients does not shift the corresponding baseband signal. The baseband shifting function $W$ thus needs to be replaced by an inband shifting operator. One solution could be to synthesize each highpass subband (perhaps by choosing the other subband to be zero) and shift the baseband signal, followed by subband decomposition to the desired subband; however, as shown in [10], due to unavoidable frequency aliasing in subband transforms, shifting a signal generally causes frequency leakage between subbands. This information leakage occurs when shifting both the low and the high pass subbands. Consequently, using the above-mentioned ad-hoc inband shifting operator, we cannot fully adapt the horizontal transform to the desired orientation.

To efficiently align the DWT basis along both near-horizontal and near-vertical orientations, the inband shifting operator must have essentially the same performance as shifting the baseband signal. In particular, the order in which we implement the horizontal and vertical decompositions must have no effect on the overall oriented 2D transform.

We employ an inband shifting operator which exploits all the available information, i.e., both the high and low pass subbands. A similar shifting operator is proposed in [11] for inband motion compensated temporal decomposition. The inband shifting kernel, $\tilde{W}_{i \to j}$, uses vertically high and low pass subband coefficients at column $i$ and is a composition of the subband synthesis filter with a baseband shifting kernel and the subband analysis filter. Using $y_i$ to represent the vector containing the interleaved vertically low and high pass coefficients at column $i$, $\tilde{W}_{i \to j}(y_i)$ is essentially the same as shifting the baseband signal at column $i$ and aligning it with geometrical features at column $j$. Using this notation, the oriented horizontal decomposition can be implemented as

$$h^{h}_{k} = y_{2k+1} - \tfrac{1}{2}\left(\tilde{W}_{2k \to 2k+1}(y_{2k}) + \tilde{W}_{2k+2 \to 2k+1}(y_{2k+2})\right) \tag{3}$$

$$l^{h}_{k} = y_{2k} + \tfrac{1}{4}\left(\tilde{W}_{2k-1 \to 2k}(h^{h}_{k-1}) + \tilde{W}_{2k+1 \to 2k}(h^{h}_{k})\right) \tag{4}$$

Similarly, the horizontal decomposition can trivially be inverted by reversing the order of lifting steps and replacing the summation with subtraction.
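The structure of the inband shifting kernel (subband synthesis, baseband shift, subband analysis) can be sketched in one dimension as follows. This is only an illustration under assumptions that are not in the paper: a periodic 5/3 lifting transform, an integer circular shift in place of the windowed interpolator, and separate `low`/`high` lists instead of an interleaved vector. The function names are hypothetical.

```python
def shift(x, d):
    """Circularly shift a signal right by d samples."""
    n = len(x)
    return [x[(i - d) % n] for i in range(n)]


def analyze(x):
    """Periodic 5/3 lifting analysis of an even-length signal."""
    n, K = len(x), len(x) // 2
    high = [x[2 * k + 1] - (x[2 * k] + x[(2 * k + 2) % n]) / 2 for k in range(K)]
    # high[k - 1] wraps around for k == 0, which gives periodic extension.
    low = [x[2 * k] + (high[k - 1] + high[k]) / 4 for k in range(K)]
    return low, high


def synthesize(low, high):
    """Inverse of analyze: undo the update step, then the predict step."""
    K = len(low)
    even = [low[k] - (high[k - 1] + high[k]) / 4 for k in range(K)]
    odd = [high[k] + (even[k] + even[(k + 1) % K]) / 2 for k in range(K)]
    x = [0.0] * (2 * K)
    x[0::2], x[1::2] = even, odd
    return x


def inband_shift(low, high, d):
    """Shift in the subband domain via synthesis, baseband shift, analysis."""
    return analyze(shift(synthesize(low, high), d))


x = [3, 7, 1, 8, 2, 9, 4, 6]
low, high = analyze(x)

# Using both subbands, the result represents the shifted baseband signal.
low1, high1 = inband_shift(low, high, 1)

# Shifting the highpass band alone (lowpass set to zero) still produces
# lowpass content: information leaks between subbands.
leak, _ = inband_shift([0.0] * len(low), high, 1)
```

In this sketch, `synthesize(low1, high1)` reproduces `shift(x, 1)`, whereas `leak` is not identically zero, which is the leakage that an operator restricted to a single subband cannot account for.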

3. IMPORTANCE OF PACKET DECOMPOSITION

The orientation adaptive transform aims to align geometrical features along vertical and horizontal directions. Even a perfectly vertical (horizontal) edge, however, generates significant activity in the horizontally (vertically) highpass subband, LH (HL). To generate a wavelet basis which approximates the asymptotically optimal rate-distortion performance of a piecewise regular function more closely, we use a packet wavelet decomposition and further decompose the subbands with high energy along the local edges. Theoretically, by further decomposing the LH or HL subband, one should be able to represent an ideal edge with a reduced number of non-zero coefficients.

The HH subband corresponds to diagonal features. The orientation adaptive transform reduces the energy in the HH subband. Depending on the orientation of the edge, however, the oriented transform localizes the energy in either the HL or the LH subband. We apply an additional level of oriented horizontal and vertical DWT decomposition to the HL and LH subbands, respectively. While, generally, only one of these subbands has high energy, we always apply the packet decomposition to both subbands; this avoids discontinuity in the wavelet transform at the transition between regions with predominantly horizontal and vertical orientation. We can, therefore, utilize the same packet subband coder as in the JPEG2000 part-II standard to scalably code the subband coefficients generated by the proposed oriented transform.

Fig. 2. Mean square highpass energy after one level of packet wavelet decomposition with and without information borrowing. The dotted arrows illustrate the source of lowpass information borrowed for oriented packet decomposition. [Diagram: Mallat decomposition followed by packet decomposition, yielding the LHH and HLH subbands.]

Decomposition                  Mean square energy of LHH    Mean square energy of HLH
Oriented with borrowing        4.73                         4.59
Oriented without borrowing     166.50                       165.90
Non oriented decomposition     422.16                       423.07

The oriented packet decomposition can be implemented similarly to equations (1)-(2) or (3)-(4). Importantly, inband shifting is required for both the vertical and the horizontal decomposition, as the transform is being applied to the primary highpass subbands. We should, however, note that for oriented packet decomposition, the inband shifting operation borrows the corresponding lowpass information from the LL subband of the primary Mallat decomposition. In this paper, for the packet wavelet decomposition, we use the same orientation information as the primary Mallat decomposition; to compensate for subsampling, however, the shift values (and correspondingly the orientation information) are scaled by a factor of 2.

Fig. 2 presents an experiment illustrating the importance of the

information borrowing for inband shifting. Without borrowing low-

pass information, we cannot fully adapt the transform to the spatial

features in the HL or LH subband. This experiment also confirms

the analogous performance of the oriented transform in adapting to

near to vertical or near to horizontal orientations.

4. ORIENTATION ESTIMATION AND CODING

Most of the currently well developed schemes for local orientation estimation are based on gradient estimation. The compression gain, however, depends strongly on the energy in the highpass subbands. In this paper we develop an ad-hoc scheme to estimate the local orientations. The proposed scheme aims to minimize the total highpass energy and to compact it into a reduced number of samples.

Fig. 3. The PSNR (dB) of scalably reconstructed Barbara using the conventional and the oriented DWT, with and without the packet decomposition (PW). [Plot: PSNR (dB) versus bit-rate (bpp); curves: Conventional Mallat, Oriented Mallat, Conventional PW, Oriented PW.]

The orientation estimation scheme compares the total highpass energy when different orientations are used within the oriented transform of Section 2. At each spatial location, the direction of geometric flow is estimated by choosing the orientation which generates the smallest total highpass energy. To smooth out the effect of noise and avoid fluctuation in the estimated orientation, we measure the local highpass energy for each 4 × 4 block of the baseband image. We found that using a weighted averaging function on block energies is also beneficial for generating a more coherent orientation map of the image.
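The block-wise selection described above can be sketched as follows. This is a simplified illustration rather than the estimator used in the paper: it assumes integer candidate displacements, measures only the residual of the vertical predict step, breaks ties in favour of the first candidate listed, and omits the weighted averaging of block energies. The names `highpass_rows` and `estimate_orientations` are hypothetical.

```python
def shift(row, d):
    """Circularly shift a row right by d samples (integer stand-in for W)."""
    n = len(row)
    return [row[(i - d) % n] for i in range(n)]


def highpass_rows(img, d):
    """Residuals of the oriented 5/3 predict step for displacement d."""
    out = []
    for r in range(1, len(img), 2):
        above = shift(img[r - 1], d)
        # At the bottom boundary, reuse the aligned row above.
        below = shift(img[r + 1], -d) if r + 1 < len(img) else above
        out.append([x - (a + b) / 2 for x, a, b in zip(img[r], above, below)])
    return out


def estimate_orientations(img, candidates=(0, -1, 1), block=4):
    """Pick, per block, the candidate with the smallest highpass energy.

    Image dimensions are assumed to be multiples of `block`.
    """
    energy = {}
    for d in candidates:
        for k, hrow in enumerate(highpass_rows(img, d)):
            for c, v in enumerate(hrow):
                key = ((2 * k + 1) // block, c // block, d)
                energy[key] = energy.get(key, 0.0) + v * v
    n_block_rows, n_block_cols = len(img) // block, len(img[0]) // block
    return [[min(candidates, key=lambda d: energy[(br, bc, d)])
             for bc in range(n_block_cols)]
            for br in range(n_block_rows)]


# A two-pixel-wide line that moves one column to the right per row.
img = [[100 if c in (4 + r, 5 + r) else 0 for c in range(16)] for r in range(8)]
orientation_map = estimate_orientations(img)
```

For this test image, blocks crossed by the line select the displacement of one sample per row, while featureless blocks fall back to the first candidate, 0.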

To efficiently code the orientation information, we use multiscale

quad-tree coding with Lagrangian based pruning. At each level, the

pruning algorithm measures the cost of coding the updated orienta-

tion values and the corresponding reduction in the highpass energy

and decides whether or not it is feasible to merge the blocks.
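A merge decision of this kind can be sketched with a generic Lagrangian cost J = D + λR, where D is the highpass energy and R the number of bits spent on orientation side information. The sketch below is an assumption-laden illustration, not the coder used in the paper: the fixed rates `leaf_bits` and `split_bits`, the tree representation and the function name `prune` are all hypothetical.

```python
def prune(node, lam, leaf_bits=3.0, split_bits=1.0):
    """Bottom-up Lagrangian pruning of an orientation quad-tree.

    A leaf is a dict mapping each candidate orientation to the highpass
    energy it produces in that block; an internal node is a list of four
    child nodes. Returns (cost, pruned_tree, energy_table), where the
    pruned tree is either a single orientation or a list of four subtrees.
    """
    if isinstance(node, dict):
        best = min(node, key=node.get)
        return node[best] + lam * leaf_bits, best, dict(node)

    results = [prune(child, lam, leaf_bits, split_bits) for child in node]
    split_cost = sum(r[0] for r in results) + lam * split_bits

    # Energy of the whole region if a single orientation is used for it.
    total = {d: sum(r[2][d] for r in results) for d in results[0][2]}
    best = min(total, key=total.get)
    merged_cost = total[best] + lam * leaf_bits

    if merged_cost <= split_cost:
        return merged_cost, best, total
    return split_cost, [r[1] for r in results], total


# Four sibling blocks; three prefer orientation 1, one prefers 0.
children = [{0: 10, 1: 0}, {0: 10, 1: 1}, {0: 9, 1: 0}, {0: 0, 1: 8}]
_, tree_cheap_rate, _ = prune(children, lam=0.0)
_, tree_costly_rate, _ = prune(children, lam=10.0)
```

With λ = 0 the side information is free and the four blocks keep their own orientations; with a larger λ the saving in rate outweighs the increase in highpass energy and the blocks are merged into one orientation.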

In this paper, we generate a separate orientation field for each spatial resolution. Four levels of pruning are used and the total cost of sending the orientation information is usually as low as 0.02 bpp; therefore, further exploiting the correlation between orientation values at different resolution levels does not seem to be advantageous.

5. EXPERIMENTAL RESULTS

In this Section, we experimentally investigate the compression ef-

ficiency and the reconstructed visual quality of the proposed orien-

tation adaptive wavelet transform. We adopt the proposed oriented

transform into a JPEG2000 codec which supports the part-II arbi-

trary decomposition styles.

Fig. 3 illustrates the PSNR of the standard test image “Barbara” (gray-scale, 512 × 512) when scalably reconstructed at different bit-rates using 5 levels of 5/3 DWT decomposition. We use four different wavelet transforms in this experiment. The conventional transform uses Mallat decomposition for all levels and is the same as compressing using the JPEG2000 codec with 5/3 kernels. The conventional packet wavelet employs one level of non-oriented packet decomposition on the LH and HL subbands at the two finest resolution levels; a JPEG2000 codec which supports the part-II features can be directly used to provide such a decomposition structure. The oriented case replaces the conventional 5/3 DWT with the proposed oriented transform in the two finest resolution levels. Finally, the oriented packet wavelet scheme employs the proposed oriented transform as well as the oriented packet decomposition on the HL and LH subbands in the two finest resolution levels. Fig. 4 shows the results of a similar experiment using a 512 × 512 block of the standard test image “Bike” (gray-scale, 2048 × 2560).

As shown in Figs. 3 and 4, the oriented transform outperforms the conventional DWT by more than 1 dB in “Barbara” and 2.5 dB in the highly oriented part of the “Bike” image. Using the pruning algorithm of Section 4 reduces the cost of sending orientation information to 0.02 and 0.04 bpp for Figs. 3 and 4, respectively. Simulation results obtained using other test images also reveal a similar behavior. The oriented transform outperforms the conventional wavelet decomposition by up to 0.8 dB with full resolution “Bike”, 0.5 dB with “Lena” (512 × 512) and 0.8 dB with “Cameraman” (256 × 256) standard test images. Due to unavoidable frequency aliasing in subband transforms, employing the proposed oriented transform in coarser resolution levels (i.e., more than 2) reveals only a minor improvement.

Fig. 4. PSNR results using subband transforms as in Fig. 3. [Plot: PSNR versus bit-rate (bpp); curves: Conventional Mallat, Oriented Mallat, Conventional PW, Oriented PW.]

Fig. 5. Visual comparison at the same PSNR. Left: Conventional DWT. Right: Oriented DWT.

Our experimental investigations show that the effective suppression of Gibbs-like artifacts for diagonal edges provides a further incentive for using the proposed orientation adaptive transform. Fig. 5 compares the performance of the conventional JPEG2000 codec (left) and the proposed oriented transform (right) when reconstructing at the same PSNR; the reconstructed image using the oriented transform has substantially superior visual quality even though the PSNR value is the same.
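To see why two reconstructions can share a PSNR value yet differ visually, recall that PSNR depends only on the mean squared error and not on where the error is located. The short sketch below uses the standard definition for 8-bit images, PSNR = 10 log10(255² / MSE); the tiny test arrays and the function name `psnr` are illustrative assumptions and are unrelated to the images used in the experiments.

```python
import math


def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    errors = [(a - b) ** 2
              for ref_row, rec_row in zip(reference, reconstructed)
              for a, b in zip(ref_row, rec_row)]
    mse = sum(errors) / len(errors)
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)


reference = [[100, 100], [100, 100]]
spread = [[102, 98], [102, 98]]          # small error on every pixel
concentrated = [[104, 100], [100, 100]]  # larger error on a single pixel

psnr_spread = psnr(reference, spread)
psnr_concentrated = psnr(reference, concentrated)
```

Both reconstructions have a mean squared error of 4 and therefore the same PSNR, even though the error is distributed very differently; errors concentrated along edges, such as ringing, are an example of a distribution that PSNR does not distinguish.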

6. CONCLUSIONS

The proposed orientation adaptive transform generates a perfectly reconstructable scalable representation of the image. The proposed scheme utilizes the inband shifting technique and the lifting realization of subband transforms to spatially adapt each lifting step to the direction of local image edges. Packet decomposition is also used to further decompose the LH and HL subbands and is shown to be more effective with the orientation adaptive transform. The proposed scheme can be used with any wavelet kernel with a lifting realization; however, the inband shifting operator employed during packet wavelet decomposition is more effective when used with wavelet kernels with single predict and update steps. In this paper, we have only reported results for the 5/3 DWT. We have, however, also applied the proposed orientation adaptive transform with 9/7 kernels. While the experimental results with Mallat decomposition are similarly promising, the oriented packet wavelet decomposition does not yield significant improvement when used with the 9/7 DWT.

7. REFERENCES

[1] D. Taubman and A. Zakhor, “Orientation adaptive subband

coding of images,” IEEE Trans. Image Proc., vol. 3, no. 4, pp.

421–437, July 1994.

[2] D. Wang, L. Zhang, and A. Vincent, “Curved wavelet trans-

form and overlapped extension for image coding,” IEEE Int.

Conf. Image Proc., vol. 2, pp. 1273–1276, Oct. 2004.

[3] E. L. Pennec and S. Mallat, “Geometrical image compression

with bandelets,” SPIE Int. Symp. Visual Comm. and Image

Proc., pp. 1273–1286, 2003.

[4] ——, “Discrete bandelets with geometric orthogonal filters,”

IEEE Int. Conf. Image Proc., pp. 65–68, 2005.

[5] W. Ding, F. Wu, and S. Li, “Lifting-based wavelet transform

with directionally spatial prediction,” Picture Coding Sympo-

sium, 2004.

[6] J. R. Ohm, “Three-dimensional subband coding with motion

compensation,” IEEE Trans. Image Proc., vol. 3, no. 5, pp.

559–571, 1994.

[7] D. Taubman and A. Zakhor, “Multi-rate 3-d subband coding of

video,” IEEE Trans. Image Proc., vol. 3, no. 5, pp. 572–588,

1994.

[8] A. Secker and D. Taubman, “Motion-compensated highly scalable video compression using an adaptive 3d wavelet transform

based on lifting,” IEEE Int. Conf. Image Proc., pp. 1029–1032,

2001.

[9] B. Pesquet-Popescu and V. Bottreau, “Three dimensional lift-

ing schemes for motion compensated video compression,”

IEEE Int. Conf. Acoust. Speech and Signal Proc., pp. 1793–

1796, 2001.

[10] N. Mehrseresht and D. Taubman, “A flexible structure for fully

scalable motion compensated 3d-DWT with emphasis on the

impact of spatial scalability,” IEEE Trans. Image Proc., (to appear) March 2006.

[11] D. Taubman, R. Mathew, and N. Mehrseresht, “Fully scal-

able video compression with sample-adaptive lifting and over-

lapped block motion,” SPIE Symp. Image and Video Comm.

and Proc., 2005.
