High-quality multi-pass image resampling
Richard Szeliski, Simon Winder, and Matt Uyttendaele
February 2010
Technical Report
MSR-TR-2010-10
This paper develops a family of multi-pass image resampling algorithms that use one-dimensional filtering stages to achieve high-quality results at low computational cost. Our key insight is to perform a frequency-domain analysis to ensure that very little aliasing occurs at each stage in the multi-pass transform and to insert additional stages where necessary to ensure this. Using one-dimensional resampling enables the use of small resampling kernels, thus producing highly efficient algorithms. We compare our results with other state-of-the-art software and hardware resampling algorithms.
Microsoft Research
Microsoft Corporation
One Microsoft Way
Redmond, WA 98052
http://www.research.microsoft.com
1 Introduction
Current texture-mapping hardware normally uses MIP-mapping (Williams 1983), sometimes combined with multiple-sample anisotropic filtering (Barkans 1997). Unfortunately, these algorithms sometimes produce considerable aliasing in areas of high-frequency content. Better filters, such as the Elliptical Weighted Average (EWA) filter (Greene and Heckbert 1986), have been developed, but even these produce visible artifacts or excessive blurring when textures are animated. While the theory of high-quality image filtering is well known (Mitchell and Netravali 1988, Heckbert 1989, Wolberg 1990), it is usually applied only to (separable) image rescaling, with sub-optimal texture-mapping algorithms being used in other cases.
Today’s CPU multi-media accelerators and GPUs have more than enough power to support
better resampling algorithms, especially for applications where visual quality is important, such as
photo manipulation, animated slideshows, panoramic image and map viewing, and visual effects.
What is missing are algorithms that can efficiently filter away high-frequency content that might
cause aliasing while simultaneously preserving important texture details.
In this paper, we develop a family of high-quality multi-pass texture mapping algorithms, which
use a series of one-dimensional filtering stages to achieve good efficiency while maintaining high
visual fidelity. The key to our approach is to use Fourier analysis to ensure that none of the stages
performs excessive blurring or aliasing, so that the resulting resampled signal contains as much
high-frequency detail as possible while avoiding aliasing. Figures 4 and 8 show the basic idea.
The image is first upsampled to prevent aliasing during subsequent shearing steps, and then downsampled to its final size with high-quality low-pass filtering. While this paper focuses on the
case of affine transforms, the basic approach can be extended to full perspective, as discussed in
Section 8.
2 Previous work
Image resampling (a.k.a. image warping or texture mapping) algorithms have been under active
development for almost three decades, and several good surveys and textooks on these subjects
1
can be found (Heckbert 1989, Wolberg 1990, Dodgson 1992, Akenine-M
¨
oller and Haines 2002).
These algorithms fall roughly into four distinct categories: image filtering, multi-pass transforms,
pyramidal MIP-mapping, and elliptical weighted averaging.
Image filtering algorithms focus on optimizing the shape of interpolation and/or low-pass filters to minimize a number of competing visual artifacts, such as aliasing, ringing, and blurring. Mitchell and Netravali (1988) introduce a taxonomy for these artifacts, and also design a cubic reconstruction filter that heuristically optimizes some of these parameters. (Dodgson (1992) has a more extended discussion of visual criteria.) Monographs and surveys on image resampling and warping such as (Heckbert 1989, Wolberg 1990, Dodgson 1992, Akenine-Möller and Haines 2002) have nice tutorial sections on image filtering and reconstruction, as do classic image processing books (Oppenheim et al. 1999) and research papers in image processing and computer vision (Unser 1999, Triggs 2001). The first four books also cover the topic of geometric transforms which underlie affine (and more general) image warping.
Multi-pass or scanline algorithms use multiple one-dimensional image re-scaling and/or shearing passes, combined with filtering of varying quality, to implement image rotations and other affine or non-linear image transforms. Heckbert (1989) and Wolberg (1990) both have nice reviews of these algorithms, including the seminal two-pass transform developed by Catmull and Smith (1980). Unfortunately, none of these techniques use high-quality (multi-tap) image filters inside their one-dimensional resampling stages, nor do they account for aliasing in the orthogonal dimension during shearing (see Section 5). It is somewhat surprising that no one has so far merged high-quality image filtering with scanline algorithms, which is what this paper aims to do.
MIP-mapping algorithms (Williams 1983) construct an image pyramid ahead of time, which makes subsequent downsampling operations more efficient by avoiding the need for large low-pass filter kernels. In its usual form, tri-linear filtering is used. First, the two nearest pyramid levels are found, using a heuristic rule based on the local affine warp being performed (Ewins et al. 1998). The results of bi-linearly interpolating each of these two images are then linearly blended. Unfortunately, the bi-linear resampling introduces aliasing, while blending imagery from the coarser level introduces additional blur. Using (4 × 4) bi-cubic interpolation has been proposed, but according
Figure 1: One-dimensional signal resampling: (a) original sampled signal f(i); (b) interpolated signal g_1(x); (c) warped signal g_2(x); (d) filtered signal g_3(x); (e) sampled signal f′(i). The corresponding spectra are shown below the signals in figures (f–j), with the aliased portions shown in red. The four stages are labeled: interpolate (∗ h_1(x)), warp (ax + t), filter (∗ h_2(x)), and sample (∗ δ(x)).
to Akenine-Möller and Haines (2002), this option is not widely available in hardware.
The performance of MIP-mapping degrades even further when the resampling becomes anisotropic. Ripmapping (Akenine-Möller and Haines 2002) extends the idea of the pyramidal MIP-map by creating rectangular smaller images as well. While this requires a 300% memory overhead (as opposed to only 30% for MIP-maps), it produces better-quality results when images are zoomed anisotropically (changes in aspect ratios). For general skewed anisotropy, a variety of multi-sample anisotropic filters have been proposed (Schilling et al. 1996, Barkans 1997). While these offer a noticeable improvement over regular MIP-mapping in heavily foreshortened regions, they still suffer from the aliasing introduced by low-quality tri-linear filters (see our Experimental Results section).
Finally, Elliptical Weighted Average (EWA) filters convolve the image directly with a non-separable oriented (skewed) Gaussian filter (Greene and Heckbert 1986). While this has the reputation in some quarters of producing high-quality results (Akenine-Möller and Haines 2002), Gaussian filtering is known to simultaneously produce both aliasing and blurring. Since the filter is isotropic in the warped coordinates, it incorrectly filters out corner frequencies in the spectrum and, being non-separable, the naïve implementation of EWA is also quite slow, although faster algorithms based on MIP-mapping have recently been proposed (McCormack et al. 1999, Hüttner and Straßer 1999).
The remainder of the paper is structured as follows. Sections 3 and 4 review the basics of one-dimensional and (separable) two-dimensional image resampling. Section 5 presents our novel three-pass optimal resampling algorithm for performing one-dimensional image shears, while Section 6 builds on these results to develop an efficient four-pass general affine resampling algorithm. Section 7 uses a variety of test images and motions to compare our algorithm to previously developed state-of-the-art resampling algorithms. We close with a discussion of future directions for research that this work suggests.
3 One-dimensional resampling
Before we describe our new algorithms, we first briefly review the theory of optimal one-dimensional
signal resampling. We use the framework shown in Figure 1, which Heckbert (1989) calls ideal
resampling and Dodgson (1992) calls the four-part decomposition (both attributed to Smith (1981)).
The original source image (or texture map) is a sampled signal f (i), as shown in Figure 1a.
Because the signal is sampled, its Fourier transform is infinitely replicated along the frequency
axis, as shown in Figure 1f.
To resample the signal, we first (conceptually) convert it into a continuous signal g(x) by convolving it with an interpolation filter, h_1(x),

    g_1(x) = Σ_i f(i) h_1(x − i),    (1)

as shown in Figure 1b. In the frequency domain, this corresponds to multiplying the original signal spectrum F(u) = F{f(x)} with the spectrum of the interpolation filter H_1(u) to obtain

    G_1(u) = F(u) H_1(u),    (2)

as shown in Figure 1f–g.
If the filter is of insufficient quality, phantom replicas of the original spectrum persist in higher
frequencies, as shown in red in Figure 1g. These replicas correspond to the aliasing introduced
during the interpolation process, and are often visible as unpleasant discontinuities (jaggies) or
motion artifacts (crawl).
Examples of interpolation filters include linear interpolation, cubic interpolation (Mitchell and
Netravali 1988), and windowed sinc interpolation (Oppenheim et al. 1999). A complete discussion
of the merits of various one-dimensional interpolation filters is beyond the scope of this paper, since
they have been widely studied in the fields of signal processing (Oppenheim et al. 1999, Wolberg
1990, Dodgson 1992), image processing (Unser 1999, Triggs 2001) and graphics (Heckbert 1986).
In this paper, we use a raised cosine-weighted sinc filter with 4 cycles (9 taps when interpolating).
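As an illustrative sketch (not the paper's actual implementation), such a kernel can be written as a sinc tapered by a raised-cosine window; the Hann-style window shape below is an assumption, since the text does not give its exact form:

```python
import numpy as np

def windowed_sinc(x, cycles=4):
    """Sinc interpolation kernel tapered by a raised-cosine window.

    The kernel is zero outside |x| >= cycles; with cycles=4 the support
    spans 9 integer taps, matching the "4 cycles (9 taps)" figure in the
    text. The exact window used in the paper may differ."""
    x = np.asarray(x, dtype=float)
    window = np.where(np.abs(x) < cycles,
                      0.5 * (1.0 + np.cos(np.pi * x / cycles)),
                      0.0)
    # np.sinc is the normalized sinc, sin(pi x)/(pi x)
    return np.sinc(x) * window
```

Because the window leaves the sinc's zero crossings at the non-zero integers intact, the filter is interpolating: it passes through the original samples exactly.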
The next step is to apply a spatial transformation to the original signal domain, e.g.,

    x = ax′ + t,    (3)

which is an affine spatial warp. (Other transformations, such as perspective or arbitrary warps, are also possible (Heckbert 1989, Wolberg 1990).) Note how we always specify the inverse warp, i.e., the mapping from final pixel coordinates x′ to original coordinates x.
The warped or transformed continuous signal and its Fourier transform (in the affine case) are

    g_2(x′) = g_1(ax′ + t),    G_2(u) = (1/a) G_1(u/a) e^{jut/a}.    (4)
If the original signal is being compressed (Figure 1c), the Fourier transform becomes dilated
(stretched) along the frequency axis (Figure 1h).
Before resampling the warped signal, we pre-filter (low-pass filter) it by convolving it with another kernel,

    g_3(x) = g_2(x) ∗ h_2(x),    G_3(u) = G_2(u) H_2(u)    (5)

(Figure 1d/i). This is particularly necessary if the signal is being minified or decimated, i.e., if a > 1 in (3). If this filtering is not performed carefully, some additional aliasing may be introduced into the final sampled signal (Figure 1j).
Figure 2: Polyphase filtering. The coefficients used from h(x) for the black sample point at ai + t and the red sample point at a(i − 1) + t are different, and would be stored in different phases of the two-dimensional polyphase lookup table h_P(k; φ).
Fortunately, because of the linearity of convolution operators, the three stages of filtering and warping can be combined into a single composite filter

    H_3(u) = H_1(u/a) H_2(u),    (6)

which is often just a scaled version of the original interpolation filter h_1(x).¹ The final discrete convolution can be written as

    f′(i) = g_3(i) = Σ_j h_3(ai + t − j) f(j) = Σ_j h([ai + t − j]/s) f(j),    (7)

where s = max(1, |a|).
The filter in (7) is a polyphase filter, since the filter coefficients being multiplied with the input signal f(j) are potentially different for every value of i (Figure 2). To see this, we can re-write (7) as

    f′(i) = Σ_j h([ai + t − j]/s) f(j) = Σ_j h_P(j* − j; φ) f(j),    (8)

where

    j* = ⌊ai + t⌋,    (9)
    φ = ai + t − j*, and    (10)
    h_P(k; φ) = h([k + φ]/s).    (11)

¹For ideal (sinc) reconstruction and low-pass filtering, the Fourier transform is a box filter of the smaller width, and hence the combined filter is itself a sinc (of larger width).
The values in h_P(k; φ) can be precomputed for a given value of s and stored in a two-dimensional look-up table.² The number of discrete fractional values of φ that need to be stored is related to the desired precision of the convolution, and is typically 2^b, where b is the number of bits of desired precision in the output (say 10 bits for 8-bit RGB images to avoid error accumulation in multi-pass transforms).
We can write the above formula (7) in a functional form

    f′ = R(f; h, s, a, t).    (12)

In other words, R is an algorithm, parameterized by a continuous filter kernel h(x), scale factors s and a, and a translation t, which takes an input signal f(i) and produces an output signal f′(i). This operator is generalized in the next section to a pair of horizontal and vertical scale/shear operators, which are the basic building blocks for all subsequent algorithms.
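The operator R can be sketched directly from (7); the tent kernel, the explicit output length `n_out`, and the boundary clamping below are illustrative assumptions rather than the paper's implementation:

```python
import math
import numpy as np

def tent(x):
    """Linear-interpolation (tent) kernel, standing in for h(x)."""
    return np.maximum(0.0, 1.0 - np.abs(x))

def resample(f, h, half_width, s, a, t, n_out):
    """1-D resampling operator f'(i) = sum_j h([a*i + t - j]/s) f(j).

    s = max(1, |a|) stretches the kernel when minifying; weights are
    re-normalized to sum to 1, as suggested in the text's footnote."""
    f = np.asarray(f, dtype=float)
    out = np.zeros(n_out)
    for i in range(n_out):
        x = a * i + t                               # inverse-warped position
        j0 = max(0, math.ceil(x - half_width * s))  # first contributing tap
        j1 = min(len(f) - 1, math.floor(x + half_width * s))
        if j1 < j0:
            continue
        j = np.arange(j0, j1 + 1)
        w = h((x - j) / s)
        if w.sum() != 0.0:
            out[i] = np.dot(w / w.sum(), f[j])
    return out
```

For example, `resample(f, tent, 1, 1, 0.5, 0, 2*len(f)-1)` doubles the sampling rate by linear interpolation, while a > 1 with s = a performs filtered decimation.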
4 Two-dimensional zooming
In this section, we review two-pass separable transforms, which can be accomplished by first resampling the image horizontally and then resampling the resulting image vertically (or vice versa). We can perform these operations by extending our one-dimensional resampling operator (12) to a pair of horizontal and vertical image resampling operators,

    f′ = R_h(f, h, s, a_0, a_1, t):  f′(i, j) = Σ_k h([a_0 i + a_1 j + t − k]/s) f(k, j), and    (13)
    f′ = R_v(f, h, s, a_0, a_1, t):  f′(i, j) = Σ_k h([a_1 i + a_0 j + t − k]/s) f(i, k).    (14)

Note that these operators not only support directional scaling and translation, but also support shearing (using a different translation for each row or column), which is used for more complex transformations.
²The values of h_P(k; φ) should be re-normalized so that Σ_k h_P(k; φ) = 1.
Figure 3: Image magnification results: (a) original chirp image; (b) tri-linear MIP-mapping;
(c) EWA filtering; (d) windowed sinc function. Notice how the other techniques produce either
excessive blur or aliasing. (Please look at these results by magnifying your document viewer.)
Image minification (zooming out) can be made more efficient using MIP-maps (Williams 1983) or ripmaps (Akenine-Möller and Haines 2002), as described in Section 2.
Figure 3 shows some examples of image magnification using tri-linear MIP-mapping, EWA
filtering, and windowed sinc low-pass filtering. Note how the windowed sinc function produces
the least aliasing and blur.
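A separable zoom can be sketched as two such 1-D passes, one along each axis. The tent kernel and the hypothetical `zoom` helper below are our own illustrative choices; the paper's R_h/R_v operators also carry shear terms, which are zero here:

```python
import math
import numpy as np

def tent(x):
    return np.maximum(0.0, 1.0 - np.abs(x))

def resample_axis(img, a, n_out):
    """Resample along axis 0: out(i, :) = sum_k h([a*i - k]/s) img(k, :),
    with s = max(1, |a|); weights re-normalized to preserve flat regions."""
    s = max(1.0, abs(a))
    out = np.zeros((n_out,) + img.shape[1:])
    for i in range(n_out):
        x = a * i
        k0 = max(0, math.ceil(x - s))
        k1 = min(img.shape[0] - 1, math.floor(x + s))
        k = np.arange(k0, k1 + 1)
        w = tent((x - k) / s)
        if w.sum() != 0.0:
            out[i] = (w / w.sum()) @ img[k]
    return out

def zoom(img, zy, zx):
    """Two-pass separable zoom: vertical pass, then horizontal pass
    (done by transposing, resampling axis 0, and transposing back)."""
    tmp = resample_axis(img, 1.0 / zy, round(img.shape[0] * zy))
    return resample_axis(tmp.transpose(1, 0), 1.0 / zx,
                         round(img.shape[1] * zx)).transpose(1, 0)
```

Each pass is a pure 1-D polyphase filter, which is what makes the separable approach cheap compared with a full 2-D kernel.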
5 Shear
In order to better explain our general affine multi-pass resampling algorithm, we start with the simpler case of a pure horizontal shear,

    [x; y] = [a_0  a_1  t;  0  1  0] [x′; y′; 1].    (15)

This corresponds to a general invocation of R_h with a_1 ≠ 0.
In the frequency domain (ignoring the effect of the translation parameter t, since it does not affect aliasing), this corresponds to a transformation of

    u′ = A^T u,  where  A^T = [a_0  0;  a_1  1].    (16)
Figure 4: Horizontal 3-pass shear: (a) original pixel grid, image, and its Fourier transform; (b) vertical upsampling onto the blue lines; (c) horizontal shear onto the diagonal red lines; (d) final vertical downsampling. The first row shows the sampling grids, the second row shows the images being resampled, and the third row shows their corresponding spectra. The frequency spectra in the third row are scaled to the unit square with maximum frequencies (±1, ±1) in order to make the upsampling and downsampling operations more intuitive.
Thus, a horizontal shear in the spatial domain induces a vertical shear in the frequency domain, which can lead to aliasing if we do not first upsample the signal vertically. (Notice how the original frequency (±1, ±1) gets mapped to (±a_0, ±1 ± a_1), which can be beyond the vertical Nyquist frequency.)³
In order to avoid aliasing, we propose a three-pass algorithm, which consists of the following steps:
1. upsample vertically by the factor r ≥ 1 + |a_1|;
2. shear and scale horizontally, with filtering to avoid aliasing;
3. low-pass filter and downsample vertically.
In terms of geometric transformation, this corresponds to factoring

    A = [a_0  a_1;  0  1] = [1  0;  0  1/r] [a_0  a_1/r;  0  1] [1  0;  0  r] = A_1 A_2 A_3,    (17)

and applying the sequence of transformations

    x = A_1 x_1,   x_1 = A_2 x_2,   x_2 = A_3 x′,    (18)
as shown in Figure 4. The transpose of the middle matrix A_2 is

    A_2^T = [a_0  0;  a_1/r  1],    (19)

which, when multiplied by the maximum frequency present in the upsampled signal, (±1, ±1/r), still lies inside the Nyquist range u′ ∈ [−1, +1]² (after horizontal filtering, which is applied during the scale/shear).
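The factorization (17) and the containment claim for A_2^T are easy to check numerically for a sample shear (a sketch, using r = 1 + |a_1| from step 1 of the algorithm):

```python
import numpy as np

a0, a1 = 0.8, 1.5
r = 1.0 + abs(a1)                       # vertical upsampling factor, step 1

A  = np.array([[a0, a1], [0.0, 1.0]])   # horizontal shear matrix from (15)
A1 = np.array([[1.0, 0.0], [0.0, 1.0 / r]])
A2 = np.array([[a0, a1 / r], [0.0, 1.0]])
A3 = np.array([[1.0, 0.0], [0.0, r]])

assert np.allclose(A1 @ A2 @ A3, A)     # the factorization (17) holds

# The highest frequency in the upsampled signal, (1, 1/r), stays inside
# the Nyquist box [-1, 1]^2 after the middle (shear) stage:
u = np.array([1.0, 1.0 / r])
assert np.all(np.abs(A2.T @ u) <= 1.0 + 1e-12)
```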
In our operational notation, this can be written as

    f_1 = R_v(f, h, 1, 1/r, 0, 0);    (20)
    f_2 = R_h(f_1, h, max(1, |a_0|), a_0, a_1/r, t);    (21)
    f_3 = R_v(f_2, h, r, r, 0, 0).    (22)
Figure 5: Horizontal 2-pass shear: (a) original Fourier transform; (b) vertical low-pass filtering; (c) horizontal shear. Notice how some high vertical frequencies are lost.
An alternative to this three-step process is a sub-optimal two-pass algorithm:
1. vertically low-pass filter the image with a bandwidth 1/r;
2. shear, scale, (and pre-filter) the image horizontally.
As we can see in Figure 5c, this results in a loss of some high-frequency information, compared to
Figure 4d. (See the Experiments section for some examples.)
Unfortunately, the above derivation suggests that the upsampling rate can get arbitrarily high as |a_1| ≫ 1. In fact, the maximum upsampling rate need never exceed r = 3 (Figure 6). This is because the pink portions of the spectrum along the left and right edges (high horizontal frequencies) do not appear in the final image, and can therefore be filtered away during the horizontal shear.
To compute a better value for r, we first compute the maximum values of the original frequencies that will appear in the final image, (u_max, v_max), as shown in Figure 6e. The value of v_max can be less than 1 if we are considering the general form of a shear matrix with vertical scaling included,

    A = [a_00  a_01;  0  a_11] = [1  0;  0  a_11/r] [a_00  a_01/r;  0  1] [1  0;  0  r],    (23)

where we have combined the vertical scaling with the initial vertical resampling stage. To avoid

³From here on we use the convention that the frequencies range over [−1, +1], since this simplifies our notation.
Figure 6: Maximum vertical resampling rate: (a) original Fourier transform; (b) vertical upsampling by a factor of 3; (c) horizontal shear and low-pass filtering horizontally; (d) final vertical downsampling; (e) general case for computing u_max and v_max. Because horizontal frequencies start being suppressed (moved to beyond the Nyquist frequency), it is not necessary to upsample by more than a factor of 3.
aliasing, we must then ensure that

    A_2^T [±u_max;  ±a_11 v_max/r] = [±a_00 u_max;  ±a_01 u_max/r ± a_11 v_max/r]    (24)

lies within the bounding box [−1, +1]², i.e., that |a_01| u_max/r + |a_11| v_max/r ≤ 1, or r ≥ |a_01| u_max + |a_11| v_max. Whenever |a_11| v_max > 1, we can clamp this value to 1, since there is no need to further upsample the signal. When the vertical scaling a_11 is sufficiently small (magnification), r < 1. Since there is no risk of aliasing during the horizontal shear, we set r = 1 and drop the final vertical downsampling stage. The formula for r thus becomes

    r = max(1, |a_01| u_max + min(1, |a_11| v_max)).    (25)
The final three (or two) stage resampling algorithm is therefore:

    f_1 = R_v(f, h, 1/v_max, a_11/r, 0, 0);    (26)
    f_2 = R_h(f_1, h, max(1, |a_00|), a_00, a_01/r, t);    (27)
    f_3 = R_v(f_2, h, r, r, 0, 0),    (28)

where the last stage is skipped if r = 1.
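Putting (25)–(28) together, the stage schedule can be sketched as a small planning function; the tuple layout (operator, s, a_0, a_1, t) and the function name are our own illustrative conventions:

```python
def three_pass_shear_plan(a00, a01, a11, t, u_max=1.0, v_max=1.0):
    """Stage parameters for the shear algorithm (26)-(28).

    Returns (r, stages), where each stage is (operator, s, a0, a1, t)
    for the 1-D operators R_v / R_h; the final vertical downsampling
    stage is dropped when r == 1."""
    # Vertical resampling rate from (25), with both clamps included:
    r = max(1.0, abs(a01) * u_max + min(1.0, abs(a11) * v_max))
    stages = [('Rv', 1.0 / v_max, a11 / r, 0.0, 0.0),          # (26)
              ('Rh', max(1.0, abs(a00)), a00, a01 / r, t)]     # (27)
    if r > 1.0:
        stages.append(('Rv', r, r, 0.0, 0.0))                  # (28)
    return r, stages

# A strong shear needs the full three passes (r = 1.5 + 1 = 2.5)...
r3, plan3 = three_pass_shear_plan(1.0, 1.5, 1.0, 0.0)
# ...while a magnifying shear collapses to two passes (r clamps to 1):
r2, plan2 = three_pass_shear_plan(1.0, 0.2, 0.25, 0.0)
```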
Figure 7: Shearing algorithm results: (a) tri-linear MIP-mapping with anisotropic filtering; (b) EWA filtering; (c) sub-optimal vertical-blur-only algorithm; (d) optimal three-stage algorithm. Please see our on-line web page http://research.microsoft.com/en-us/um/redmond/groups/ivm/HQMPIR/ for more results as well as animations that better show the aliasing artifacts.
Figure 7 shows some examples of shears rendered using our optimal three-pass algorithm, the sub-optimal (blurred) 2-pass shear algorithm, as well as a variety of previously developed resampling algorithms.
6 General affine
It is well known that any 2D affine transform can be decomposed into two shear operations (Heckbert 1989, Wolberg 1990). For example, if we perform the horizontal shear first, we have

    A = [a_00  a_01  t_0;  a_10  a_11  t_1;  0  0  1] = [b_0  b_1  t_2;  0  1  0;  0  0  1] [1  0  0;  a_10  a_11  t_1;  0  0  1],    (29)

with

    b_0 = a_00 − a_01 a_10/a_11,   b_1 = a_01/a_11,   and   t_2 = t_0 − a_01 t_1/a_11.    (30)

Notice that the above algorithm becomes degenerate as a_11 → 0, which is a symptom of the bottleneck problem (Wolberg 1990). Fortunately, we can transpose the input (or output) image and adjust the transform matrix accordingly.
To determine whether to transpose the image, we first re-scale the first two rows of A into unit vectors,

    Â = [â_00  â_01;  â_10  â_11] = [a_00/l_0  a_01/l_0;  a_10/l_1  a_11/l_1],    (31)

where l_i = sqrt(a_i0² + a_i1²). We then compute the absolute cosines of these vectors with the x and y axes, |â_00| and |â_11|, and compare these to the absolute cosines with the transposed axes, i.e., |â_01| and |â_10|. Whenever |â_00| + |â_11| < |â_01| + |â_10|, we transpose the image.
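The transpose test can be written directly from (31); this is a sketch, and the function name is our own:

```python
import numpy as np

def should_transpose(A):
    """Decide whether to transpose the image before the two-shear
    decomposition: normalize each row of the 2x2 matrix A to unit
    length, then compare absolute cosines with the axes (Section 6)."""
    A = np.asarray(A, dtype=float)
    A_hat = A / np.linalg.norm(A, axis=1, keepdims=True)  # rows -> unit vectors
    return (abs(A_hat[0, 0]) + abs(A_hat[1, 1])
            < abs(A_hat[0, 1]) + abs(A_hat[1, 0]))
```

For a small rotation the test returns False; near 90° the rows align better with the swapped axes and it returns True, which also steers the algorithm away from the a_11 → 0 bottleneck.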
Having developed a three-pass transform for each of the two shears, we could concatenate these
to obtain a six-pass separable general affine transform. However, it turns out that we can collapse
some of the shears and subsequent upsampling or downsampling operations to obtain a four-pass
transform, as shown in Figure 8.
The trick is to perform the horizontal upsampling needed for later vertical shearing at the same
time as the original horizontal shear. In a similar vein, the vertical downsampling can be performed
in the same pass as the vertical shear and scale.
In terms of geometric transformations, this corresponds to a factorization of the form

    A = [1  0;  0  a_11/r_v] [b_0  a_01/r_v  t_2;  0  1  0] [1/r_h  0;  0  1] [1  0;  0  r_v] [1  0  0;  a_10/(a_11 r_h)  1  t_1/a_11;  0  0  1]
      = [1  0  0;  0  a_11/r_v  0;  0  0  1] [b_0/r_h  a_01/r_v  t_2;  0  1  0;  0  0  1] [1  0  0;  a_10 r_v/(a_11 r_h)  r_v  t_1 r_v/a_11;  0  0  1] [r_h  0  0;  0  1  0;  0  0  1].    (32)
In order to compute the appropriate values for r_v and r_h, we must first determine which frequency in the original image needs to be preserved in the final image, as shown in Figure 6e. Frequencies that get mapped completely outside the final spectrum can be pre-filtered away during the upsampling stages, thereby reducing the total number of samples generated. We compute the
Figure 8: 4-pass rotation: (a) original pixel grid, image, and its Fourier transform; (b) vertical upsampling; (c) horizontal shear and upsampling; (d) vertical shear and downsampling; (e) horizontal downsampling. The general affine case looks similar except that the first two stages perform general resampling.
Figure 9: Two-dimensional chirp pattern affinely resampled using: (a) high-quality bicubic; (b) trilinear MIP-map with anisotropic filtering; (c) EWA filter; (d) high-quality four-pass rendering (this paper). Please zoom in on the images to see more details.
values of (u_max, v_max) by intersecting the [−1, +1]² square with the projection of the final spectrum onto the original spectrum through A^T (the dashed blue lines in Figure 8).
Once we know the (u_max, v_max) extents, we can compute the upsampling rates using

    r_v = max(1, |a_01| u_max + min(1, |a_11| v_max)), and    (33)
    r_h = max(1, |a_10/a_11| r_v v_max + min(1, |b_0| u_max)).    (34)
The final four-step algorithm therefore consists of:

    f_1 = R_v(f, h, 1/v_max, a_11/r_v, 0, t_1);    (35)
    f_2 = R_h(f_1, h, 1/u_max, b_0/r_h, a_01/r_v, t_2);    (36)
    f_3 = R_v(f_2, h, r_v, r_v, a_10 r_v/(a_11 r_h), 0);    (37)
    f_4 = R_h(f_3, h, r_h, r_h, 0, 0).    (38)
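The schedule (29)–(38) can be sketched as a planning function (assuming a_11 ≠ 0 after any transpose; the function name and tuple layout are our own conventions):

```python
def four_pass_plan(a00, a01, a10, a11, t0, t1, u_max=1.0, v_max=1.0):
    """Stage parameters (operator, s, a0, a1, t) for the four-pass
    affine algorithm (35)-(38), using the shear decomposition (30)
    and the upsampling rates (33)-(34)."""
    assert a11 != 0.0, "transpose the image first (bottleneck case)"
    b0 = a00 - a01 * a10 / a11                 # horizontal-shear terms (30)
    t2 = t0 - a01 * t1 / a11
    rv = max(1.0, abs(a01) * u_max + min(1.0, abs(a11) * v_max))            # (33)
    rh = max(1.0, abs(a10 / a11) * rv * v_max + min(1.0, abs(b0) * u_max))  # (34)
    return [('Rv', 1.0 / v_max, a11 / rv, 0.0, t1),          # (35)
            ('Rh', 1.0 / u_max, b0 / rh, a01 / rv, t2),      # (36)
            ('Rv', rv, rv, a10 * rv / (a11 * rh), 0.0),      # (37)
            ('Rh', rh, rh, 0.0, 0.0)]                        # (38)

# 45-degree rotation: both shears are active and both rates exceed 1.
c = s = 2 ** -0.5
plan = four_pass_plan(c, -s, s, c, 0.0, 0.0)
```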
If mip-maps or ripmaps are being used (Section 4), the amount of initial downsampling can
be reduced by finding the smallest pyramid level that contains the frequency content needed to
reconstruct the warped signal and adjusting the matrix entries in A appropriately.
7 Experimental results
We have evaluated our algorithm on both synthetic and natural images. On our on-line web page http://research.microsoft.com/en-us/um/redmond/groups/ivm/HQMPIR/, results are shown for pure zoom (Figure 3), the three-pass 1D shear (Figure 4), and two four-pass cases: simultaneous rotation plus zoom, and tilted orthographic rotation (Figure 9). In each case we also show results for the highest quality Windows GDI+ warp, EWA, and the NVidia 7800 GPU performing tri-linear anisotropic texture filtering. Filtering artifacts are sometimes more apparent under animation, so we provide animated examples of each of these transforms.
The synthetic image we use for most of our evaluation is a 2D chirp (Figure 9), since it contains a broad spectrum of frequencies up to the Nyquist rate in both u and v. Using this pattern, excessive low-pass filtering appears as a dimming of the chirp and aliasing appears as a Moiré pattern. The
equation we used to generate the chirp is:

    f(x, y) = cos(2π k_x x²) · cos(2π k_y y²).    (39)
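The pattern in (39) can be generated as follows; the k = n/4 normalization, which sweeps the local frequency up to exactly the Nyquist rate at the far edge, is our assumption, since the paper does not give its constants:

```python
import numpy as np

def chirp(n=512):
    """2-D chirp f(x, y) = cos(2*pi*k*x^2) * cos(2*pi*k*y^2) on [0, 1)^2.

    The instantaneous frequency of cos(2*pi*k*x^2) is 2*k*x cycles per
    unit, so k = n/4 reaches the Nyquist rate (0.5 cycles/sample) at the
    far edge x -> 1."""
    k = n / 4.0
    x = np.arange(n) / n
    xx, yy = np.meshgrid(x, x)
    return np.cos(2 * np.pi * k * xx ** 2) * np.cos(2 * np.pi * k * yy ** 2)
```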
In our evaluation we show the transformed images as well as the result of applying the transform and then applying the inverse transform to warp back to the original pixel grid. For animated results, this round-trip transform is useful because the eye is not distracted by the motion of the image and can focus on the blur and aliasing artifacts (Dodgson 1992).
We have also evaluated the results on the natural images shown in Figure 10. For these results, we refer users to the supplementary web page, where the ability to compare images and the animations show the benefits of our algorithm. The greatest aliasing effects can be seen for diagonal frequencies in the GDI+ and GPU anisotropic filters, and this can be seen as Moiré patterns in areas of fine detail. The balance between aliasing and blurring can be set subjectively for EWA by altering the α parameter, but one or both are always present. When viewing our results, notice that our algorithm has far less aliasing, evident from the lack of Moiré in both the chirp results and the picket fence of the lighthouse image. Other algorithms also display more aliasing in the wheel areas of the bikes image. Fortunately, the lack of aliasing in our algorithm does not come at the expense of sharpness, since we maintain high-frequency details where other techniques have blurred them out.
Figure 10: (a) Lighthouse test image; (b) Bikes test image.
8 Extensions and future work
While this paper has developed the basic theory for multi-pass non-aliasing affine transforms, it
also suggests a number of promising directions for future research. These include optimizing the
order of shears and scalings, rendering triangle meshes, memory optimizations, and full perspective
warping.
Order of shears In this paper, we have implemented the general affine warp as a horizontal shear
followed by a vertical shear (with additional transpose and/or resampling stages, as needed). In
theory, the order of the shears should not matter if perfect filtering is used. In practice, non-ideal
filters may induce a preferred ordering, as might computational considerations. For example, we
may want to defer large amounts of magnification until later stages in the pipeline.
Filter selection and optimization In our current implementation, we have used a 4-cycle (9-tap) windowed sinc filter, since it provides a reasonable tradeoff between quality and efficiency. Using more taps (up to 6 or so) results in slight reductions in aliasing, while using fewer results in significant degradation. It would be worthwhile to investigate alternative filters, especially those
designed to reduce ringing while preserving high frequency content. Dodgson (1992) contains a
nice discussion of non-linear filters that might be appropriate.
Triangle mesh rendering In this paper, we haven't said anything about the sizes of the intermediate images required during the intermediate stages. If we generalize the final image to a triangle or polygon, we can warp its shape back through successive stages and also add required additional pixels around the boundaries to ensure that each filter has sufficient support. Only the pixels inside each support region would then have to be resampled.
A more interesting question is whether triangle meshes could be rendered by re-using warped and filtered samples from adjacent triangles. This is a subtle issue, since if there are large discontinuities in the local affine transforms across triangle edges, visible errors might be induced.
Tiling and pipelining While GPUs are relatively insensitive to the order in which texture-memory pixels are fetched (so long as pixels are re-used several times during computation), the same is not true for regular CPU memory. Optimizing memory accesses by breaking up the image into smaller 2D tiles and potentially pipelining the computation may result in significant speedups.
Perspective The family of multi-pass scanline algorithms such as (Catmull and Smith 1980) on which our work is based includes transforms such as perspective. We have not yet fully developed the theory of optimal multi-pass perspective algorithms because achieving full computational efficiency is tricky.
Perspective resampling is usually implemented by locally computing an affine approximation to the full transform and using its parameters to control the amount of filtering (Greene and Heckbert 1986, Wolberg 1990, McCormack et al. 1999). We could easily take Catmull and Smith's original 2-pass perspective transform and replace each stage with an optimal per-pixel polyphase filter. (The filter bandwidth would vary spatially.) The difficulty lies in computing the amount
of upsampling that needs to be applied before each one-dimensional (perspective) shearing stage.
Since this quantity is non-linear in the affine parameters because of the absolute value, there is
no rational linear formula that would locally determine the amount of upsampling required, which
leads to a more complex algorithm. We could always just upsample each image/stage by the theoretical maximum of r = 3; instead, we leave the development of the full perspective case to future work.
9 Conclusions
In this paper, we have developed a 4-stage scanline algorithm for affine image warping and re-
sampling. Our algorithm uses optimal one-dimensional filtering at each stage to ensure that the
image is neither excessively blurred nor aliased, which is not the case for previously developed algorithms. Because each stage uses only one-dimensional filters, the overall computational efficiency
is very good, being amenable to GPU implementation using pixel shaders. While our algorithm
may not be suitable for some applications, such as scenes with high polygon complexity, we believe that
it forms the basis of a new family of higher quality resampling and texture mapping algorithms
with wide applicability to scenes that require high visual fidelity.
References
Akenine-Möller, T. and Haines, E. (2002). Real-Time Rendering. A K Peters, Wellesley, Massachusetts, second edition.
Barkans, A. C. (1997). High quality rendering using the Talisman architecture. In Proceedings
of the Eurographics Workshop on Graphics Hardware.
Betrisey, C. et al. (2000). Displaced filtering for patterned displays. In Society for Information Display Symposium, pages 296–299.
Catmull, E. and Smith, A. R. (1980). 3-D transformations of images in scanline order. Computer Graphics (SIGGRAPH’80), 14(3), 279–285.
Dodgson, N. A. (1992). Image Resampling. Technical Report TR261, Wolfson College and
Computer Laboratory, University of Cambridge.
Ewins, J. et al. (1998). Mip-map level selection for texture mapping. IEEE Transactions on
Visualization and Computer Graphics, 4(4), 317–329.
Greene, N. and Heckbert, P. (1986). Creating raster Omnimax images from multiple perspective
views using the elliptical weighted average filter. IEEE Computer Graphics and Applications,
6(6), 21–27.
Heckbert, P. (1986). Survey of texture mapping. IEEE Computer Graphics and Applications,
6(11), 56–67.
Heckbert, P. (1989). Fundamentals of Texture Mapping and Image Warping. Master’s thesis,
The University of California at Berkeley.
Hüttner, T. and Straßer, W. (1999). Fast footprint MIPmapping. In 1999 SIGGRAPH / Eurographics Workshop on Graphics Hardware, pages 35–44.
McCormack, J., Perry, R., Farkas, K. I., and Jouppi, N. P. (1999). Feline: Fast elliptical lines for
anisotropic texture mapping. In Proceedings of SIGGRAPH 99, pages 243–250.
Mitchell, D. P. and Netravali, A. N. (1988). Reconstruction filters in computer graphics. Com-
puter Graphics (Proceedings of SIGGRAPH 88), 22(4), 221–228.
Oppenheim, A. V., Schafer, R. W., and Buck, J. R. (1999). Discrete-Time Signal Processing.
Prentice Hall, Englewood Cliffs, New Jersey, 2nd edition.
Schilling, A., Knittel, G., and Straßer, W. (1996). Texram: A smart memory for texturing. IEEE Computer Graphics and Applications, 16(3), 32–41.
Smith, A. R. (1981). Digital Filtering Tutorial for Computer Graphics. Technical
Memo 27, Computer Graphics Project, Lucasfilm Ltd. Revised Mar 1983, available on
http://alvyray.com/Memos/MemosPixar.htm.
Triggs, B. (2001). Empirical filter estimation for subpixel interpolation and matching. In Eighth
International Conference on Computer Vision (ICCV 2001), pages 550–557, Vancouver, Canada.
Unser, M. (1999). Splines: A perfect fit for signal and image processing. IEEE Signal Processing
Magazine, 16(6), 22–38.
Williams, L. (1983). Pyramidal parametrics. Computer Graphics, 17(3), 1–11.
Wolberg, G. (1990). Digital Image Warping. IEEE Computer Society Press, Los Alamitos.