TIFS: A Hybrid Scheme Integrating PIFS
and Linear Transforms
Michele Nappi, Daniel Riccio
Dipartimento di Matematica e Informatica
Università di Salerno,
84084 Fisciano (SA), Italy
{mnappi,driccio}@unisa.it
Abstract—Many desirable properties make fractals a powerful mathematical model applied in several image processing and pattern recognition tasks: image coding, segmentation, feature extraction and indexing, just to cite some of them. Unfortunately, they are based on a strongly asymmetric scheme, and so suffer from very high coding times. On the other hand, linear transforms are quite time balanced, allowing them to be usefully integrated in real-time applications, but they do not provide comparable performance with respect to image quality at high bit rates. Owing to their potential for preserving the original image energy in a few coefficients in the frequency domain, linear transforms have also enjoyed widespread diffusion in side applications such as selecting representative features or defining new image quality measures. In this paper, we investigate different levels of embedding linear transforms in the fractal coding scheme. Experimental results have been organized so as to point out the contribution of each embedding step to the objective quality of the decoded image.
I. INTRODUCTION
The literature about fractal image compression has grown uninterruptedly since the preliminary definition of Partitioned Iterated Function Systems (PIFS) due to Jacquin in 1989; much of the interest in fractal coding is due to its side applications in fields such as image database indexing [6] and face recognition [14]. These applications both utilize some sort of coding, and they can reach a good discriminating power even in the absence of a high PSNR from the coding module. The majority of works on fractal image compression set the speed-up of the coding process as their main goal, while preserving desirable properties of fractal coding such as a high compression rate, fast decoding and scale invariance. Many different solutions have been proposed to speed up the coding phase [3,5,10–12,19], for instance modifying the partitioning process or providing new classification criteria or heuristic methods for the range/domain matching problem. All these approaches can be grouped into three classes: classification methods, feature vectors [22] and local search. Generally, speed-up methods based on nearest neighbour search by feature vectors outperform all the others in terms of decoded image quality at a comparable compression rate, but they often suffer from the high dimensionality of the feature vector; Saupe's operator is a suitable example. To cope with this, dimension reduction techniques have been introduced: Saupe reduced the dimension of the feature vector by averaging pixels, while in [24] the DCT is used to cut out redundant information. In the same way, linear transforms have also been widely exploited to extract representative features or to encode groups of pixels in image indexing and compression applications. Indeed, linear transforms form the basis of many compression systems as they de-correlate the image data and provide good energy compaction. For example, the Discrete Fourier Transform (DFT) [25] is used in many image processing systems, while the Discrete Cosine Transform (DCT) [25] is used in standards like JPEG, MPEG and H.261. Still others are the Walsh-Hadamard Transform (WHT) [25] and the Haar Transform (HT) [25]. In particular, linear transforms have also been a matter of study in the field of objective quality measure definition. The HVS norm, based on some preliminary DCT filtering, is just an example [17], but the magnitude and phase of DFT coefficients have also been used to define new objective quality measures [1]. This is motivated by the fact that standard objective measures such as the Root Mean Square Error (RMSE) and the Peak Signal to Noise Ratio (PSNR) are in some cases very far from human perception. Hence, this paper sets as its main goal to investigate the ways of embedding a generic linear transform T into the standard PIFS coding scheme. In more detail, at first the linear properties of T are exploited to dramatically reduce the computational costs of the coding phase, by arranging its coefficients in a suitable way. Subsequently, the RMSE, commonly used to upper-bound the collage error, is replaced by a new objective distance measure based on the T coefficients. Then, the T coefficients corresponding to high frequencies are used to compensate for the information lost by the fractal scheme in highly compressed regions.
II. THEORETICAL CONCEPTS
In order to shed light on further discussions about the hybrid scheme proposed in this paper, it is worth drawing the reader's attention to some basic concepts about fractal compression and linear transforms.
A. Partitioned Iterated Function Systems
A PIFS consists of a set of local affine contractive transformations, which exploit the image self-similarities to cut out redundancies while extracting salient features. In more detail, a given input image I is partitioned into a set $R = \{r_1, r_2, \ldots, r_{|R|}\}$ of disjoint square regions of size $|r| \times |r|$, named ranges. Another set $D = \{d_1, d_2, \ldots, d_{|D|}\}$ of larger regions is extracted from the same image I. These regions are called domains and can overlap. Their size is $|d| \times |d|$, where usually $|d| = 2|r|$. Since a domain is four times the size of a range, it must be shrunk by a 2 × 2 averaging operation on its pixels. This is done only once, downsampling the original image and obtaining a new image that is a quarter of the original. An overall representation of the PIFS compression scheme is reported in Fig. 1.
The image I is encoded range by range: for each range r, it is necessary to find a domain d and two real numbers α and β such that

$$\min_{d \in D} \left\{ \min_{\alpha,\beta} \left\| r - (\alpha d + \beta) \right\|^2 \right\}. \qquad (1)$$
Doing so minimizes the quadratic error with respect to the Euclidean norm. It is customary to impose that $|\alpha| \le 1$ in order to ensure convergence in the decoding phase. The inner minimum on α and β is immediate to compute by solving a minimum square error problem, obtaining
$$\alpha = \frac{\displaystyle\sum_{1 \le i \le |r|} \sum_{1 \le j \le |r|} (r_{i,j} - \bar{r})(d_{i,j} - \bar{d})}{\displaystyle\sum_{1 \le i \le |d|} \sum_{1 \le j \le |d|} (d_{i,j} - \bar{d})^2} \qquad (2)$$

$$\beta = \bar{r} - \alpha \bar{d}, \qquad (3)$$

where |r| and |d| are the dimensions of ranges and shrunken domains, while $\bar{r}$ and $\bar{d}$ are the mean values of the range r and the domain d, respectively. The outer minimum on d, however, requires an exhaustive search over the whole set D, which is an impractical operation. Therefore, ranges and domains are classified by means of feature vectors in order to reduce the cost of searching the domain pool: if the range r is being encoded, only the domains having a feature vector close to that of r are considered.
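The domain shrinking step and the inner minimization of Eqs. (2)-(3) can be sketched numerically as follows; the function names, block sizes and toy data are illustrative, not from the paper:

```python
import numpy as np

def shrink(domain):
    """2x2 averaging: reduce a |d| x |d| domain to a |d|/2 x |d|/2 block."""
    h, w = domain.shape
    return domain.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def fit_alpha_beta(r, d):
    """Least-squares alpha and beta minimizing ||r - (alpha*d + beta)||^2,
    as in Eqs. (2)-(3); d is the already-shrunken domain, same size as r."""
    rc = r - r.mean()
    dc = d - d.mean()
    denom = (dc * dc).sum()
    alpha = (rc * dc).sum() / denom if denom > 0 else 0.0
    beta = r.mean() - alpha * d.mean()
    return alpha, beta

# toy example: a range that is exactly an affine map of a domain
rng = np.random.default_rng(0)
d = rng.random((8, 8))
r = 0.5 * d + 10.0
alpha, beta = fit_alpha_beta(r, d)
print(alpha, beta)  # recovers 0.5 and 10.0 (up to rounding)
```

Note that when the domain is flat (zero variance), the denominator of Eq. (2) vanishes, so the sketch falls back to α = 0 and β equal to the range mean.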
[Figure 1 (block diagram): the input image is segmented into ranges and domains; the domains are classified in a KD-tree; for each range, a list of candidate domains is retrieved from the KD-tree, the approximation error (RMSE) of each candidate is estimated, and the best domain is selected.]
Fig. 1. The architecture of our fractal coder.
B. Linear Transforms
A transform T is linear if it has two mathematical properties:

T(x + y) = T(x) + T(y)  (additivity)
T(αx) = αT(x)  (homogeneity)

A third property, shift invariance, is not a strict requirement for linearity, but it is a mandatory property for most image processing techniques. These three properties form the mathematical basis on which linear transform theory is defined and used. Homogeneity and additivity play a critical role in linearity, while shift invariance is somewhat on the side. This is because linearity is a very broad concept, encompassing much more than just signals and systems: when there are no signals involved, shift invariance has no meaning, so it can be thought of as an additional aspect of linearity needed when signals and systems are involved. Linear transform domain features are very effective when the patterns are characterized by their spectral properties; so, in this paper, the feature extraction capabilities of the Discrete Fourier Transform (DFT), the Discrete Cosine Transform (DCT) and the Haar Transform (HT) are investigated. They are formally defined as follows:
DFT: $v(k) = \sum_{n=0}^{N-1} x[n]\, e^{-\frac{2\pi i k n}{N}}$, with $k = 0, 1, \ldots, N-1$

DCT: $v(k) = \alpha(k) \sum_{n=0}^{N-1} x[n] \cos\!\left( \frac{n k \pi}{N-1} \right)$, with $k = 0, 1, \ldots, N-1$, $\alpha(0) = \sqrt{1/N}$ and $\alpha(k) = \sqrt{2/N}$ for $k \neq 0$

HT: $v(0) = \int_0^1 x(n) H_0(n)\,dn$ and $v(j,k) = \int_0^1 x(n) H_{jk}(n)\,dn$, with $H_{jk}(n) = 2^{j/2} H(2^j n - k)$, $j \geq 0$, $k = 0, 1, \ldots, 2^j - 1$, where $H_0 = 1_{[0,1)}$ is the characteristic function on $[0,1)$ and

$$H(n) = \begin{cases} 1 & 0 \le n < 1/2 \\ -1 & 1/2 \le n < 1 \end{cases}$$
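As a quick numerical check of the DFT definition above, and of the additivity property from the beginning of this section, a short sketch; the test signals are arbitrary:

```python
import numpy as np

def dft(x):
    """v(k) = sum_n x[n] * exp(-2*pi*i*k*n/N), the DFT as defined above."""
    N = len(x)
    n = np.arange(N)
    k = n.reshape(-1, 1)  # one row of the transform matrix per frequency k
    return (x * np.exp(-2j * np.pi * k * n / N)).sum(axis=1)

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([4.0, 3.0, 2.0, 1.0])
v = dft(x)
print(np.allclose(v, np.fft.fft(x)))          # True: matches the library DFT
print(np.allclose(dft(x + y), dft(x) + dft(y)))  # True: additivity holds
```

The same additivity/homogeneity check works for any of the three transforms; it is exactly this linearity that the hybrid scheme exploits in Section III-B.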
III. IMPROVING QUALITY
A major problem in evaluating lossy techniques is the extreme difficulty of describing the type and amount of degradation in reconstructed images. Because of the inherent drawbacks associated with subjective measures of image quality, there has been a great deal of interest in developing quantitative measures that can consistently be used as substitutes. All these measures have largely been used to assess the quality of the whole image after a coding process has been applied; in other words, the original image is compressed/decompressed by means of an encoder and then the overall amount of distortion introduced by the coding scheme is measured. Thus, objective measures have represented an effective way to compare different coding schemes in terms of the percentage of distortion introduced for a fixed compression ratio. The key idea is then to embed quality measures into the coding process, rather than confining them to being a sheer analysis tool. The compression scheme we adopted for this study, which is represented in Fig. 1, lays itself open to a direct replacement of the RMSE by other quality measures.
A. New quality measures
Many objective quality measures [1,7] have been defined to replace subjective evaluations by retaining, as much as possible, fidelity to the human perception of image distortions introduced by coding schemes. The most common measures are undoubtedly the RMSE (Root Mean Square Error) and the PSNR (Peak Signal to Noise Ratio) [1]. They owe their wide spread to the fact that they work well on average while having a very low computational cost. However, there are cases in which the quality estimates given by the PSNR are very far from human perception (see Fig. 2), and this has led many researchers to define new quality metrics providing better performance in terms of distortion measurement, even if at a higher computational cost. Some examples are given by the Human Visual System norm [17] or the FFT Magnitude Phase Norm [1].
Fig. 2. Two pictures with the same objective quality (PSNR 26.5 dB), but very different subjective quality.
Measures in Table I are defined in the spatial domain and they are all discrete and bivariate.¹

TABLE I
QUALITY MEASURES

Average Difference: $AD = \frac{1}{n^2} \sum_{j=0}^{n-1} \sum_{k=0}^{n-1} \left| R(j,k) - \hat{R}(j,k) \right|$

Correlation Quality: $CQ = \frac{\sum_{j=0}^{n-1} \sum_{k=0}^{n-1} \left| R(j,k)^2 - \hat{R}(j,k)\,R(j,k) \right|}{\sum_{j=0}^{n-1} \sum_{k=0}^{n-1} R(j,k)}$

Image Fidelity: $IF = \frac{\sum_{j=0}^{n-1} \sum_{k=0}^{n-1} \left[ R(j,k) - \hat{R}(j,k) \right]^2}{\sum_{j=0}^{n-1} \sum_{k=0}^{n-1} R(j,k)^2}$

Maximum Difference: $MD = \max \left\{ \left| R(j,k) - \hat{R}(j,k) \right| \right\}$

Normalized Cross-Correlation: $NK = \left| 1 - \frac{\sum_{j=0}^{n-1} \sum_{k=0}^{n-1} R(j,k)\,\hat{R}(j,k)}{\sum_{j=0}^{n-1} \sum_{k=0}^{n-1} R(j,k)^2} \right|$

Structural Content: $SC = \left| 1 - \frac{\sum_{j=0}^{n-1} \sum_{k=0}^{n-1} R(j,k)^2}{\sum_{j=0}^{n-1} \sum_{k=0}^{n-1} \hat{R}(j,k)^2} \right|$

RMSE: $\mathrm{RMSE} = \sqrt{\frac{1}{n^2} \sum_{j=0}^{n-1} \sum_{k=0}^{n-1} \left[ R(j,k) - \hat{R}(j,k) \right]^2}$

PMSE: $\mathrm{PMSE} = \frac{\sum_{j=0}^{n-1} \sum_{k=0}^{n-1} \left[ R(j,k) - \hat{R}(j,k) \right]^2}{n^2 \max_{j,k} [R]^2}$

Hosaka Plots: formally defined in [17]
On the contrary, the most significant examples of image
quality measures defined in the frequency domain are the HVS
and the DFT magnitude/phase norm.
¹ R(j,k) and $\hat{R}(j,k)$ denote the samples of the original and approximated range blocks.
Human Visual System Norm (HVS): a few models of the HVS have been developed in the literature; in [17], dealing with the Discrete Cosine Transform, Nill defined his function for the model as a band-pass filter with a transfer function in polar coordinates. The image quality is therefore calculated on pictures processed through such a spectral mask and then inverse discrete cosine transformed.

FFT Magnitude Phase Norm (FFT-MP): a spectral distance-based measure is the Fourier magnitude and/or phase spectral discrepancy on a block basis [1]. In general, while the mean square error is among the best measures for additive noise, local phase-magnitude measures are more suitable for coding and blur artifacts. In particular, the FFT magnitude/phase norm is most sensitive to distortion artifacts, but at the same time least sensitive to the typology of images.

Both these measures have drawbacks. The HVS is too complex to be profitably used in several applications, while the FFT-based distance has two main limitations: a) the phase is significantly smaller than the magnitude, and its contribution to the overall distance value is made even more negligible by a very small factor λ; b) the n-norm and the arctan, needed to compute magnitude and phase, are computationally intensive to calculate, in particular for complex coefficients.

Hence it appears that fractal image coding can significantly profit from a simpler image quality measure exploiting the properties of linear transforms. In more detail, we can define such a distance as follows.
Let Γ(u, v) and $\hat{\Gamma}(u, v)$ be the coefficients of the transform T applied to a block of the original and of the compressed image, respectively. In defining the quality measure it is necessary to distinguish two cases, depending on whether the transform coefficients have only real components or also present imaginary ones. In the second case, the pair of components must be reduced to a single value. In the literature this is usually achieved by considering, instead of the pair of values, the absolute value (or magnitude) of the complex coefficient. Here a different route has been chosen, defining an operator Ψ that handles both cases. When Γ(u, v) has only real components, Ψ(u, v) = Γ(u, v); otherwise Ψ is defined as follows:

Ψ(u, 2v) = Re(Γ(u, v))
Ψ(u, 2v + 1) = Im(Γ(u, v))

Figure 3 gives a graphical representation of how the coefficients are reorganized into a new block. This choice is dictated by the fact that the Discrete Fourier Transform, the only one in our case yielding complex coefficients, has the property of concentrating most of the useful information in the first coefficients, located near the top-left corner of the matrix.
Thus, the LT distance function can be defined as follows:

$$LT = \frac{1}{n^2} \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} \left[ \Psi_R(u,v) - \Psi_{\hat{R}}(u,v) \right]^2. \qquad (4)$$
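A minimal sketch of the Ψ operator and of the LT distance of Eq. (4); the use of NumPy's fft2 as the transform T and the helper names are our own assumptions, and no coefficient truncation is applied here:

```python
import numpy as np

def psi(block):
    """Psi operator: real transform blocks pass through unchanged; complex
    coefficients are split, Re -> even columns, Im -> odd columns (Fig. 3)."""
    if not np.iscomplexobj(block):
        return block
    n = block.shape[1]
    out = np.empty((block.shape[0], 2 * n))
    out[:, 0::2] = block.real
    out[:, 1::2] = block.imag
    return out

def lt_distance(R, R_hat, transform=np.fft.fft2):
    """LT distance of Eq. (4) between a range and its approximation;
    normalization by n^2 follows the formula (for a complex transform
    the Psi block has 2n columns, all of which enter the sum)."""
    n = R.shape[0]
    pr, pa = psi(transform(R)), psi(transform(R_hat))
    return ((pr - pa) ** 2).sum() / (n * n)

rng = np.random.default_rng(1)
R = rng.random((8, 8))
print(lt_distance(R, R))  # identical blocks give distance 0.0
```

Swapping `transform` for a DCT or Haar implementation yields the LT+DCT and LT+Haar variants compared in Section IV.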
[Figure 3 (diagram): the real components Re(Γ(u,v)) and imaginary components Im(Γ(u,v)) of the transform block are interleaved column by column into a single real-valued block.]
Fig. 3. Reorganization of real and imaginary components by the Ψ operator.
B. Embedding quality measures in PIFS
In PIFS coding the whole image is partitioned into a set of ranges (as described in Section II-A). For each range, the coding scheme looks for an approximating domain to be assigned to it, the domain being mapped onto the corresponding range by an affine transformation. To a given range R, PIFS associates the domain providing the smallest approximation error in a root mean square sense, so exactly at that point it is possible to embed different quality measures to decide the best range/domain association. The key idea underlying this strategy is that quality measures outperforming the RMSE from a subjective point of view can improve the subjective appearance of the whole image by improving the quality of each range. In other words, in the original definition of the PIFS coding scheme as proposed by Jacquin, the range is approximated by the transformation $\hat{R} = \alpha \cdot D + \beta$ by minimizing the error function $\|R - (\alpha \cdot D + \beta)\|^2$. In this paper, this function has been replaced by 10 alternative functions. There are two different ways of embedding new quality measures in PIFS coding. In the former, α and β are still computed by solving a mean square error problem, while the distance between the original and the transformed range is measured by a new quality measure $f(R, \hat{R})$. In the latter, the formulas for α and β are rewritten to properly minimize the LT distance of Eq. (4), so that the coding scheme is specialized at an even deeper level by nesting this quality measure. Thus, starting from (4) and knowing that
$\Gamma_{\hat{R}}(u,v) = T(\hat{R})$ and $\hat{R} = \alpha \cdot D + \bar{\beta}$, it comes out that $\Gamma_{\hat{R}}(u,v) = T(\hat{R}) = T(\alpha \cdot D + \bar{\beta}) = \alpha \cdot \Gamma_D + B$, where

$$B = \begin{cases} \beta & \text{if } u = v = 0 \\ 0 & \text{otherwise} \end{cases}$$

Let P be the set of all pairs $\{(u,v)\} \setminus \{(0,0)\}$ with $u, v = 0, \ldots, n-1$; then LT can be rewritten as follows:
$$LT = \frac{1}{n^2} \left\{ \left[ \Psi_R(0,0) - \left( \alpha \cdot \Psi_D(0,0) + \beta \right) \right]^2 + \sum_{(u,v) \in P} \left[ \Psi_R(u,v) - \alpha \cdot \Psi_D(u,v) \right]^2 \right\}; \qquad (5)$$
reorganizing all terms with respect to α and β, one obtains:
$$LT = \frac{1}{n^2} \Bigg[ \alpha^2 \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} \Psi_D(u,v)^2 - 2\alpha \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} \Psi_R(u,v)\,\Psi_D(u,v) + 2\alpha\beta\,\Psi_D(0,0) - 2\beta\,\Psi_R(0,0) + \beta^2 + \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} \Psi_R(u,v)^2 \Bigg] \qquad (6)$$
The values of α and β minimizing LT are given by:

$$\alpha = \frac{RD - \Psi_R(0,0)\,\Psi_D(0,0)}{D^2 - \Psi_D(0,0)^2}, \qquad \beta = \frac{D^2\,\Psi_R(0,0) - RD\,\Psi_D(0,0)}{D^2 - \Psi_D(0,0)^2}, \qquad (7)$$

where $D^2 = \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} \Psi_D(u,v)^2$ and $RD = \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} \Psi_R(u,v)\,\Psi_D(u,v)$.
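Eq. (7) can be sketched as follows; `psi_R` and `psi_D` stand for the Ψ-reorganized transform blocks of a range and a domain, and the toy data is constructed so that the exact α and β are known in advance:

```python
import numpy as np

def fit_alpha_beta_lt(psi_R, psi_D):
    """alpha and beta minimizing the LT distance, as in Eq. (7).
    psi_R, psi_D: Psi-reorganized transform blocks of range and domain."""
    D2 = (psi_D ** 2).sum()                # sum of squared domain coefficients
    RD = (psi_R * psi_D).sum()             # cross term between the two blocks
    r0, d0 = psi_R[0, 0], psi_D[0, 0]      # DC coefficients
    denom = D2 - d0 ** 2
    alpha = (RD - r0 * d0) / denom
    beta = (D2 * r0 - RD * d0) / denom
    return alpha, beta

# toy blocks: psi_R is alpha*psi_D plus beta added to the DC term only,
# exactly the model Gamma_Rhat = alpha*Gamma_D + B of the derivation
rng = np.random.default_rng(2)
psi_D = rng.random((8, 8))
psi_R = 0.7 * psi_D
psi_R[0, 0] += 3.0
a, b = fit_alpha_beta_lt(psi_R, psi_D)
print(a, b)  # recovers 0.7 and 3.0 (up to rounding)
```

Since β only enters the DC coefficient, the closed form above touches a single transform coefficient, which is what makes the transform-domain fit no costlier than the spatial one of Eqs. (2)-(3).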
An important observation made when embedding linear transform based measures in PIFS coding is that LT can yield PSNR values larger than those obtained with the RMSE (even though the PSNR is maximized where the RMSE reaches its minimum). The explanation of why this happens resides in the range/domain matching process. As soon as the coder finds a domain giving an approximation error lower than a fixed threshold, the domain pool search stops and the range is coded by this domain. The LT metric induces the coder to perform a more thorough domain search, since it is more selective than the RMSE and yields a small approximation error (lower than the fixed threshold) only for range/domain comparisons which result in small RMSE values; on the other hand, the number of range/domain matchings for each range is upper-bounded by a fixed constant l (50 in our case), so that the coding time is not significantly affected by the additional comparisons. Fig. 4 reports a graphical example of this kind of situation.
[Figure 4 (diagram): with threshold Th = 5.0, the RMSE-driven search stops at the first domain whose error falls below the threshold, while the more selective LT-driven search examines further candidates before stopping.]
Fig. 4. LT and RMSE searching for a given range.
The main grounds can be found in the fact that, from a subjective point of view, image distortions are uniformly distributed in pictures coded with this measure. The reason why this happens resides in the image partitioning process. In other words, a range is coded by the best approximating domain if the approximation error is lower than a given threshold, and is further partitioned otherwise. The RMSE and LT generally give different quad-tree partitionings for the same image. In particular, the LT partitioning is more balanced, favouring midsize blocks. Figure 5 shows two different quad-tree decompositions for the Lena image, the former obtained with the LT measure and the latter with the RMSE.
LT+FFT quad-tree
RMSE quad-tree
Fig. 5. LT and RMSE quad-tree partitioning of the Lena image.
A further example of the improvement in subjective quality provided by the LT measure is given in Figure 6, which reports the eye regions of the mandrill image coded with the LT, HVS, RMSE and PMSE quality measures at a compression ratio of 20:1. This picture shows that the overall quality reached by the LT surpasses all the others.

[Figure 6 (image panels): LT+FFT, HVS, RMSE, PMSE.]
Fig. 6. Eyes of the mandrill decoded with the LT, HVS, RMSE and PMSE quality measures at a compression ratio of 20:1.
C. Coding the residual information

Linear transforms can be integrated into the fractal coding scheme at a third level as well (the first two being the acceleration of the coding process and the replacement of the quality measure): the coding of the residual information. In the proposed hybrid scheme, the transform T is applied to all the domains in the pool during the indexing phase, while during the coding phase it is applied to each range; this implies that at coding time T(r) and T(d) are both known, while the transform T(r̂) of the approximated range r̂ can be obtained at no additional computational cost by exploiting the linearity properties of T already discussed in Section II. The coefficients of the transform T can be further used to preserve part of the initial energy lost by the fractal coding. In particular, since r̂ is computed so as to minimize the approximation error for the range r, it is reasonable to assume that δ = T(r) − T(r̂) is characterized by values that are small on average, and hence representable with a small number of bits. Moreover, in the decoding phase the approximated range r̂ is obtained by fractal decoding, so, given δ and having recomputed T(r̂), it would be possible to recover r as T⁻¹(T(r̂) + δ). It is therefore evident that the heart of the discussion that follows is the definition of an efficient strategy for storing, if not all the information contained in δ, at least part of it, while maintaining an advantageous quality/bit-rate trade-off. In general terms, then, the residual coding scheme is the following:

Coding
a) compute δ_ij = Γ^r_ij − Γ^r̂_ij, 0 ≤ i, j ≤ n
b) if |δ_ij| < ε set δ_ij = 0
c) for each i, j such that |δ_ij| > 0, write δ_ij in compact form

Decoding
a) decode the image I by PIFS
b) for each range r̂ compute Γ_r̂
c) read δ from the file
d) replace r̂ with T⁻¹(Γ_r̂ + δ) in the image I

The most important aspect is the compact representation of the δ_ij values within the file; the approach used in the proposed scheme is discussed below. The first step in the residual coding process is the choice of the threshold ε. In more detail, each δ_ij weighs, in terms of bits, proportionally to its value; the higher the threshold ε, the more δ_ij are zeroed, so a lower threshold leaves a greater number of δ_ij to be written to the file, which translates into a better quality at the cost of a higher bit-rate. The criterion adopted to determine the threshold ε in this approach depends strongly on the way the δ_ij values are quantized. As will be explained in more detail below, their modulus |δ_ij| is approximated by the quantity 3/2·ε. This implies that the threshold ε must be chosen so as to maximize the number of coefficients δ_ij falling around that value. It must also be taken into account that the larger the modulus of a coefficient δ_ij, the more important its correct coding. From this perspective, a criterion for the automatic determination of ε can be defined. First, we build a histogram H, where H(i) indicates how many δ_ij over the whole image satisfy 3/2|δ_ij| = i; then we compute ε = argmax_k Σ_{i=k}^{2k} i·H(i). More words must instead be spent on the more delicate issues of the quantization and coding of the δ_ij values. At coding time, after steps a) and b), each block contains only values larger than the threshold ε. Hence, if a generic block contains no δ_ij > 0, it carries no significant residual information and is not processed further, while a single bit set to zero is written to the file. Otherwise, a first bit set to 1 is written to the file. Since many rows of the block δ may be empty, distinguishing the rows i containing some δ_ij ≠ 0 from those with δ_ij = 0 for all j allows a further saving of bit-rate. In more detail, for each row i of the block δ a bit b is written to the file, where b = 1 if there exists j such that δ_ij ≠ 0 and b = 0 otherwise, for a total of n bits (n being the side length of the block δ). Moreover, for each δ_ij only the sign and the column j are stored. Indeed, it can be observed that in the block δ we have |δ_ij| > ε for all retained i, j, while for sufficiently large ε we have |δ_ij| < 2ε for most of the δ_ij. Based on these observations, in the proposed approach the δ_ij have been approximated by the constant value 3/2·ε (the centre of the interval [ε, 2ε]) rather than quantized individually; it follows that for each coefficient only the information about the column j remains to be saved in the file. Since the block δ is a sparse matrix, each element would normally be saved as a pair of coordinates (i, j); however, in this specific case, having distinguished empty rows from rows with at least one non-zero coefficient, the bit-rate can be reduced further. Indeed, the δ_ij values are read from the file consecutively, one row after another; it is therefore possible to save in the file only the column j of each coefficient, since for coefficients on the same row the columns are in increasing order. This implies that if, while reading a sequence of columns, the one just read is smaller than the previous one, it is the column index of a subsequent row; the correct row to which it belongs can be recovered as the first non-empty one (for which a bit set to 1 was previously saved). Note that this will not necessarily be the immediately following row i + 1, since completely empty rows are frequent. In conclusion, for each block containing residual information only 1 + n + K·log₂(n) bits are saved, where K is the number of δ_ij ≠ 0; otherwise only a single additional bit set to 0 is saved.
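The thresholding, quantization and bit-counting steps of the residual coding scheme above can be sketched as follows; the bit cost follows the 1 + n + K·log₂(n) count stated in the text (sign bits are not included in that count), and the function names and toy block are illustrative:

```python
import numpy as np

def residual_bit_cost(delta, eps, n):
    """Bit cost of one n x n residual block delta under the scheme above:
    1 flag bit; if any coefficient survives thresholding, n row-occupancy
    bits plus log2(n) column bits per surviving coefficient."""
    d = np.where(np.abs(delta) < eps, 0.0, delta)  # step b): zero small values
    K = int(np.count_nonzero(d))
    if K == 0:
        return 1                       # empty block: a single 0 bit
    return 1 + n + K * int(np.log2(n))

def quantize(delta, eps):
    """Each surviving coefficient is replaced by sign * 3/2*eps,
    the centre of the interval [eps, 2*eps]."""
    out = np.zeros_like(delta)
    mask = np.abs(delta) >= eps
    out[mask] = np.sign(delta[mask]) * 1.5 * eps
    return out

delta = np.array([[0.2, 2.5], [-3.0, 0.1]])
print(residual_bit_cost(delta, eps=1.0, n=2))  # 1 + 2 + 2*1 = 5 bits
print(quantize(delta, eps=1.0))                # survivors snap to +/-1.5
```

With ε = 1.0 only two of the four toy coefficients survive, so the block costs 5 bits instead of the 2·(1 + log₂2) + … a naive (i, j)-pair encoding would require.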
IV. EXPERIMENTAL RESULTS
Tests have been conducted on a dataset of twenty images, twelve of them coming from the Waterloo BragZone standard database [15] and the remaining eight from the web. A large variability in testing conditions has been ensured by selecting test images containing patterns, smooth regions and details. They are all 8-bit grayscale images at a resolution of 512×512 pixels. The performance of the algorithm has been assessed from different points of view. The main aim of the tests is to underline the efficiency of the LT based feature vector and the improvements given by LT based quality measures. The compression ratio has been calculated as the ratio between the original image size and the coded image size. Because of the partial reversibility of the coding process, the fractal compression of the image adds noise to the original signal. Less added noise means greater image quality, and therefore a better algorithm. Noise is usually measured by the Peak Signal-to-Noise Ratio (PSNR), which in dB can be computed as follows:
$$\mathrm{PSNR} = 10 \cdot \log_{10} \left( \frac{M \cdot N \cdot 255^2}{\sum_{m,n} \left( s_{m,n} - \hat{s}_{m,n} \right)^2} \right),$$

where M and N are the image width and height, 255 is the maximum pixel value, $s_{m,n}$ is the pixel value in the original image and $\hat{s}_{m,n}$ is the corresponding pixel in the decoded image. In order to further assess the performance of the hybrid scheme, we also compared it with Saupe's algorithm [20,21].
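The PSNR formula above transcribes directly into code; the guard for identical images is our own addition:

```python
import numpy as np

def psnr(original, decoded):
    """PSNR in dB for 8-bit images, per the formula above."""
    err = ((original.astype(float) - decoded.astype(float)) ** 2).sum()
    if err == 0:
        return float("inf")  # identical images: infinite PSNR
    M, N = original.shape
    return 10.0 * np.log10(M * N * 255.0 ** 2 / err)

a = np.zeros((4, 4))
b = np.full((4, 4), 255.0)
print(psnr(a, b))  # 0.0 dB: the worst possible case for two 8-bit images
```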
In the first experiment, a given test image is decoded at different compression ratios and the corresponding PSNR values are computed. This is repeated for all the quality measures from Section III-A. Figure 7 reports the mean curves over all test images. According to their performance in terms of PSNR, quality measures can be grouped into three main classes:

class I: NK, CQ, SC, Hosaka Plots;
class II: IF, AD, MD, PMSE;
class III: RMSE, HVS, FFT-MP.

The measures in class I provide very poor performance from an objective point of view, with images showing coding artifacts even at small compression ratios. Class II measures always outperform class I, but the PSNR is still lower than the one obtained with the RMSE, mainly when the compression ratio increases. On the contrary, measures in class III show the best PSNR curves, with the FFT magnitude measure (FFT-MP) outperforming the RMSE in most cases, while HVS and RMSE are almost comparable. From Fig. 7 it clearly comes out that the CQ, Hosaka, NK and SC quality measures provide very poor performance, which is a further confirmation that measures from group I are not effective at all when applied within PIFS coding. Notice that, except for the Hosaka Plots, none of the measures in group I is based on the difference between the original and the approximated ranges; products and ratios are usually involved instead, resulting in unstable quality measures when small values appear at the denominator. Fig. 7 also points out that quality measures belonging to group II have quite comparable performance from an objective point of view. They perform better than the first group, but not satisfactorily enough. The main limitation of the AD and MD could be seen in the fact that they are founded on the absolute difference between the original and the approximated range values, which does not emphasize differences as strongly as the squared difference does. On the contrary, the substantial drawback of the IF and PMSE is the ratio with the values of the original range, which may be near zero, resulting in a very large distance. Therefore, group III still represents the set of best candidate measures to be embedded into the PIFS scheme. In particular, HVS and RMSE are almost comparable in performance, while FFT-MP significantly outperforms both. There are two main reasons motivating the superiority of the FFT-MP: a) the FFT retains most of the image information in its first coefficients, which makes it more robust than the RMSE with respect to small changes in details, by principally characterizing low frequencies; b) the ease of computing it by summing square differences. Indeed, even though the HVS is based on the DCT (Discrete Cosine Transform), which often provides better performance in several image processing applications (image coding, filtering, indexing), it is less effective than FFT-MP, probably due to the complexity of the model.
[Figure 7 (plot): mean PSNR (27-38 dB) versus compression ratio (5-50) for the AD, CQ, Hosaka, HVS, IF, MD, FFT-MP, NK, PMSE, RMSE and SC quality measures.]
Fig. 7. Average PSNR curves over all the test images.
In the second experiment, the three variants of the new quality measure based on the properties of linear transforms were compared with the RMSE measure used in the baseline fractal coding model. The results presented in Fig. 8 highlight some aspects of particular interest. First, the performance provided by the Haar transform is particularly poor; this is probably due to the extreme simplicity of the transform, which in this case does not capture in its coefficients enough useful information to meaningfully assess the amount of distortion introduced by the range/domain approximation. On the other hand, it emerges that RMSE and LT+DCT provide almost comparable performance in terms of PSNR; this further confirms what was previously asserted about the HSV model, namely that applying the cosine transform does not necessarily lead to an objective improvement of the image quality, although from a subjective point of view the results seem better than those of RMSE. Once again, then, LT+FFT appears to be the best choice. Although for small compression ratios (up to 18:1) it lies slightly below RMSE, for higher values it yields a constant improvement over the latter in terms of PSNR. It has been experimentally observed that keeping the real and the imaginary parts of the coefficients separate, in the manner described in III-A, gives better results than the FFT-MP measure both from an objective and a subjective point of view; this is mainly explained by the fact that LT+FFT, as defined, preserves the signs of the individual coefficients and does not merge the information coming from both components of each complex value, as FFT-MP instead does by resorting to the modulus of the complex coefficients. Moreover, by cutting away the part of the block containing the smallest coefficients, related to the high frequencies, it retains only the information that is actually useful for estimating the visual quality of the approximation.
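A minimal sketch of the LT+FFT measure as just described, assuming it sums squared differences computed separately on the real and the imaginary parts of the low-frequency FFT coefficients (so coefficient signs survive, unlike FFT-MP, which works on magnitudes); the size of the retained low-frequency corner (`k`) is an illustrative assumption.

```python
import numpy as np

def lt_fft(range_block, approx_block, k=4):
    """LT-style distance on the 2-D FFT of the two blocks.

    Real and imaginary parts of each coefficient are compared
    separately, so the signs of the coefficients are preserved
    rather than merged into a single magnitude as in FFT-MP.
    Coefficients outside the k x k low-frequency corner are
    discarded: they carry little of the block's energy and
    contribute mostly noise to the quality estimate.
    """
    fr = np.fft.fft2(range_block)[:k, :k]
    fa = np.fft.fft2(approx_block)[:k, :k]
    d_real = np.sum((fr.real - fa.real) ** 2)
    d_imag = np.sum((fr.imag - fa.imag) ** 2)
    return float(d_real + d_imag)
```

Since |z1| - |z2| can vanish even when z1 != z2, comparing real and imaginary parts separately is strictly more discriminative than comparing magnitudes, which is consistent with the behaviour reported above.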
[Figure 8 plot: average PSNR versus compression ratio for LT+DCT, LT+FFT, LT+Haar and RMSE.]
Fig. 8. Average PSNR performance of RMSE and LT based quality measures (FFT, DCT, Haar) over all the test images.
From equation (5) it follows that LT vanishes when T(α·D + β̄) ≈ T(R), that is, when α·D + β̄ ≈ R, which is the same condition underlying the RMSE; hence, as one would expect, LT(R, α·D + β̄) ≈ 0 when RMSE(R, α·D + β̄) ≈ 0. In particular, from an experimental point of view it has been observed that, for sufficiently small values of the RMSE (< 30 in our case), the values of α and β computed with formulas (2) and (7) tend to coincide when RMSE(R, α·D + β̄) ≈ 0. However, the values provided by formulas (7) give slightly lower PSNR, but with a subjective quality that is in some cases better. Figure 9 shows two details of the Barbara test image coded with the three variants of the LT based measure (DCT, FFT and Haar) and with the RMSE measure.
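Formulas (2) and (7) are not reproduced in this excerpt. As a point of reference, the sketch below gives the classical least-squares fit used in the standard PIFS scheme, i.e. the α and β minimizing ||R − (α·D + β)||²; the transform-domain formulas (7) discussed above may differ slightly from it.

```python
import numpy as np

def collage_coefficients(domain, rng):
    """Least-squares scaling (alpha) and offset (beta) mapping a
    (down-sampled) domain block onto a range block, minimizing
    ||rng - (alpha * domain + beta)||^2.

    This is the standard PIFS fit; it is shown here only as a
    baseline for the paper's alternative formulas.
    """
    d = np.asarray(domain, dtype=float).ravel()
    r = np.asarray(rng, dtype=float).ravel()
    n = d.size
    denom = n * np.dot(d, d) - d.sum() ** 2
    if denom == 0:          # flat domain block: only the offset matters
        return 0.0, float(r.mean())
    alpha = (n * np.dot(d, r) - d.sum() * r.sum()) / denom
    beta = (r.sum() - alpha * d.sum()) / n
    return float(alpha), float(beta)
```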
[Figure 9: magnified details coded with LT+FFT, LT+Haar, RMSE and LT+DCT.]
Fig. 9. Comparison between LT based quality measures and the RMSE by a magnification of two different regions from the Barb image (with the compression ratio fixed at 12:1).

In the last experiment, the contribution offered by coding the residual information through the technique described in III-C was evaluated. Figure 10 shows the average PSNR/CR values obtained over the 20 test images; each quadrant of the figure reports the results obtained with one of the four measures without residual coding, with the residual information coded through the DCT, and with the residual information coded through the Haar transform. In particular, it can be observed that the contribution given by residual coding is significant above all for relatively low compression ratios. This is to be ascribed to the very nature of the method employed: as the compression ratio grows, the distortion introduced by the fractal coding increases significantly, widening the interval [ǫ, 2ǫ] over which the coefficients of the residual to be coded are distributed (in absolute value), so that they can hardly be approximated by a single pair of values {−3/2ǫ, +3/2ǫ}. It can also be observed that the performance provided by the DCT and by the Haar transform in coding the residual information is almost the same.
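The exact residual-coding scheme of III-C is not reproduced in this excerpt; the sketch below illustrates one plausible reading of the paragraph above, in which residual transform coefficients whose magnitude falls in [ǫ, 2ǫ] are replaced by the interval midpoint ±3ǫ/2 (sign preserved) and all others are dropped. The single-level 2-D Haar transform is included purely as an illustration of the transform step.

```python
import numpy as np

def haar2d(block):
    """Single-level orthonormal 2-D Haar transform of an even-sized block."""
    x = np.asarray(block, dtype=float)
    # rows: pairwise averages (low-pass) and differences (high-pass)
    lo = (x[:, 0::2] + x[:, 1::2]) / np.sqrt(2)
    hi = (x[:, 0::2] - x[:, 1::2]) / np.sqrt(2)
    x = np.hstack([lo, hi])
    # columns: same decomposition along the other axis
    lo = (x[0::2, :] + x[1::2, :]) / np.sqrt(2)
    hi = (x[0::2, :] - x[1::2, :]) / np.sqrt(2)
    return np.vstack([lo, hi])

def quantize_residual(coeffs, eps):
    """Midpoint quantization of residual transform coefficients.

    Coefficients with magnitude in [eps, 2*eps] are mapped to the
    single pair {-3/2 eps, +3/2 eps} (the interval midpoint, sign
    preserved); all other coefficients are dropped.  Storing one
    sign bit per surviving coefficient keeps the residual cheap.
    """
    c = np.asarray(coeffs, dtype=float)
    q = np.zeros_like(c)
    mask = (np.abs(c) >= eps) & (np.abs(c) <= 2 * eps)
    q[mask] = np.sign(c[mask]) * 1.5 * eps
    return q
```

This also makes the limitation noted above concrete: at high compression ratios more coefficients land outside (or far from the midpoint of) [ǫ, 2ǫ], so a single value pair approximates them poorly.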
[Figure 10 plots: PSNR versus compression ratio in four quadrants (DCT, FFT, Haar and RMSE quality measures), each with and without DCT- or Haar-coded residual information.]
Fig. 10. PSNR versus Compression Ratio with and without LT based coding of the residual information.

V. CONCLUSIONS
Fractal coding is a particularly fertile research area, thanks to its countless collateral applications. Most of the works presented in the literature aim at speeding up the coding phase, whose slowness is the bottleneck in practical applications of PIFS; however, little has been done as far as improving the coding quality of the image is concerned. In this work, PIFS and linear transforms are combined in order to improve both the objective and the subjective quality of the image. The linear transforms are integrated at two different levels: in a first phase they are used to define a quality measure that replaces the commonly used RMSE, while at a later stage they enable the coding of the residual information. The experiments have been designed so as to highlight how each of these two forms of integration contributes to improving both the objective and the subjective quality of the image. Moreover, three different linear transforms (FFT, DCT, Haar) have been considered, in order to highlight the weight each of them can have on the coding phase. The results obtained in this study also represent an encouraging basis for further investigation, such as the use of linear transforms to speed up the coding phase, so as to achieve a complete integration between PIFS and linear transforms.