High-quality multi-pass image resampling
Richard Szeliski, Simon Winder, and Matt Uyttendaele
February 2010
Technical Report
MSR-TR-2010-10
This paper develops a family of multi-pass image resampling algorithms that use one-dimensional filtering stages to achieve high-quality results at low computational cost. Our key insight is to perform a frequency-domain analysis to ensure that very little aliasing occurs at each stage in the multi-pass transform and to insert additional stages where necessary to ensure this. Using one-dimensional resampling enables the use of small resampling kernels, thus producing highly efficient algorithms. We compare our results with other state-of-the-art software and hardware resampling algorithms.
Microsoft Research
Microsoft Corporation
One Microsoft Way
Redmond, WA 98052
http://www.research.microsoft.com
1 Introduction
Current texture-mapping hardware normally uses MIP-mapping (Williams 1983), sometimes combined with multiple-sample anisotropic filtering (Barkans 1997). Unfortunately, these algorithms sometimes produce considerable aliasing in areas of high-frequency content. Better filters, such as the Elliptical Weighted Average (EWA) filter (Greene and Heckbert 1986), have been developed, but even these produce visible artifacts or excessive blurring when textures are animated. While the theory of high-quality image filtering is well known (Mitchell and Netravali 1988, Heckbert 1989, Wolberg 1990), it is usually applied only to (separable) image rescaling, with suboptimal texture-mapping algorithms being used in other cases.
Today’s CPU multimedia accelerators and GPUs have more than enough power to support
better resampling algorithms, especially for applications where visual quality is important, such as
photo manipulation, animated slideshows, panoramic image and map viewing, and visual effects.
What is missing are algorithms that can efficiently filter away high-frequency content that might cause aliasing while simultaneously preserving important texture details.
In this paper, we develop a family of high-quality multi-pass texture mapping algorithms, which use a series of one-dimensional filtering stages to achieve good efficiency while maintaining high visual fidelity. The key to our approach is to use Fourier analysis to ensure that none of the stages performs excessive blurring or aliasing, so that the resulting resampled signal contains as much high-frequency detail as possible while avoiding aliasing. Figures 4 and 8 show the basic idea. The image is first upsampled to prevent aliasing during subsequent shearing steps, and then downsampled to its final size with high-quality low-pass filtering. While this paper focuses on the case of affine transforms, the basic approach can be extended to full perspective, as discussed in Section 8.
2 Previous work
Image resampling (a.k.a. image warping or texture mapping) algorithms have been under active development for almost three decades, and several good surveys and textbooks on these subjects can be found (Heckbert 1989, Wolberg 1990, Dodgson 1992, Akenine-Möller and Haines 2002). These algorithms fall roughly into four distinct categories: image filtering, multi-pass transforms, pyramidal MIP-mapping, and elliptical weighted averaging.
Image filtering algorithms focus on optimizing the shape of interpolation and/or low-pass filters to minimize a number of competing visual artifacts, such as aliasing, ringing, and blurring. Mitchell and Netravali (1988) introduce a taxonomy for these artifacts, and also design a cubic reconstruction filter that heuristically optimizes some of these parameters. (Dodgson (1992) has a more extended discussion of visual criteria.) Monographs and surveys on image resampling and warping such as (Heckbert 1989, Wolberg 1990, Dodgson 1992, Akenine-Möller and Haines 2002) have nice tutorial sections on image filtering and reconstruction, as do classic image processing books (Oppenheim et al. 1999) and research papers in image processing and computer vision (Unser 1999, Triggs 2001). The first four books also cover the topic of geometric transforms, which underlie affine (and more general) image warping.
Multi-pass or scanline algorithms use multiple one-dimensional image rescaling and/or shearing passes, combined with filtering of varying quality, to implement image rotations and other affine or nonlinear image transforms. Heckbert (1989) and Wolberg (1990) both have nice reviews of these algorithms, including the seminal two-pass transform developed by Catmull and Smith (1980). Unfortunately, none of these techniques use high-quality (multi-tap) image filters inside their one-dimensional resampling stages, nor do they account for aliasing in the orthogonal dimension during shearing (see Section 5). It is somewhat surprising that no one has so far merged high-quality image filtering with scanline algorithms, which is what this paper aims to do.
MIP-mapping algorithms (Williams 1983) construct an image pyramid ahead of time, which makes subsequent downsampling operations more efficient by avoiding the need for large low-pass filter kernels. In its usual form, trilinear filtering is used. First, the two nearest pyramid levels are found, using a heuristic rule based on the local affine warp being performed (Ewins et al. 1998). The results of bilinearly interpolating each of these two images are then linearly blended. Unfortunately, the bilinear resampling introduces aliasing, while blending imagery from the coarser level introduces additional blur. Using (4 × 4) bicubic interpolation has been proposed, but according
Figure 1: One-dimensional signal resampling: (a) original sampled signal f(i); (b) interpolated signal g_1(x); (c) warped signal g_2(x); (d) filtered signal g_3(x); (e) sampled signal f'(i). The corresponding spectra F(u), G_1(u), G_2(u), G_3(u), and F'(u) are shown below the signals in figures (f–j), with the aliased portions shown in red. The stages are labeled: interpolate (∗ h_1(x)), warp (ax + t), filter (∗ h_2(x)), and sample (∗ δ(x)).
to Akenine-Möller and Haines (2002), this option is not widely available in hardware.
The performance of MIP-mapping degrades even further when the resampling becomes anisotropic. Rip-mapping (Akenine-Möller and Haines 2002) extends the idea of the pyramidal MIP-map by creating rectangular smaller images as well. While this requires a 300% memory overhead (as opposed to only 30% for MIP-maps), it produces better-quality results when images are zoomed anisotropically (changes in aspect ratios). For general skewed anisotropy, a variety of multi-sample anisotropic filters have been proposed (Schilling et al. 1996, Barkans 1997). While these offer a noticeable improvement over regular MIP-mapping in heavily foreshortened regions, they still suffer from the aliasing introduced by low-quality trilinear filters (see our Experimental Results section).
Finally, Elliptical Weighted Average (EWA) filters convolve the image directly with a non-separable oriented (skewed) Gaussian filter (Greene and Heckbert 1986). While this has the reputation in some quarters of producing high-quality results (Akenine-Möller and Haines 2002), Gaussian filtering is known to simultaneously produce both aliasing and blurring. Since the filter is isotropic in the warped coordinates, it incorrectly filters out corner frequencies in the spectrum and, being non-separable, the naïve implementation of EWA is also quite slow, although faster algorithms based on MIP-mapping have recently been proposed (McCormack et al. 1999, Hüttner and Straßer 1999).
The remainder of the paper is structured as follows. Sections 3 and 4 review the basics of one-dimensional and (separable) two-dimensional image resampling. Section 5 presents our novel three-pass optimal resampling algorithm for performing one-dimensional image shears, while Section 6 builds on these results to develop an efficient four-pass general affine resampling algorithm. Section 7 uses a variety of test images and motions to compare our algorithm to previously developed state-of-the-art resampling algorithms. We close with a discussion of future directions for research that this work suggests.
3 One-dimensional resampling

Before we describe our new algorithms, we first briefly review the theory of optimal one-dimensional signal resampling. We use the framework shown in Figure 1, which Heckbert (1989) calls ideal resampling and Dodgson (1992) calls the four-part decomposition (both attribute it to (Smith 1981)).
The original source image (or texture map) is a sampled signal f (i), as shown in Figure 1a.
Because the signal is sampled, its Fourier transform is inﬁnitely replicated along the frequency
axis, as shown in Figure 1f.
To resample the signal, we first (conceptually) convert it into a continuous signal g(x) by convolving it with an interpolation filter h_1(x),

    g_1(x) = Σ_i f(i) h_1(x − i),    (1)

as shown in Figure 1b. In the frequency domain, this corresponds to multiplying the original signal spectrum F(u) = F{f(x)} with the spectrum of the interpolation filter H_1(u) to obtain

    G_1(u) = F(u) H_1(u),    (2)
as shown in Figure 1f–g.
If the ﬁlter is of insufﬁcient quality, phantom replicas of the original spectrum persist in higher
frequencies, as shown in red in Figure 1g. These replicas correspond to the aliasing introduced
during the interpolation process, and are often visible as unpleasant discontinuities (jaggies) or
motion artifacts (crawl).
Examples of interpolation filters include linear interpolation, cubic interpolation (Mitchell and Netravali 1988), and windowed sinc interpolation (Oppenheim et al. 1999). A complete discussion of the merits of various one-dimensional interpolation filters is beyond the scope of this paper, since they have been widely studied in the fields of signal processing (Oppenheim et al. 1999, Wolberg 1990, Dodgson 1992), image processing (Unser 1999, Triggs 2001), and graphics (Heckbert 1986). In this paper, we use a raised cosine-weighted sinc filter with 4 cycles (9 taps when interpolating).
The next step is to apply a spatial transformation to the original signal domain, e.g.,

    x = a x' + t,    (3)

which is an affine spatial warp. (Other transformations, such as perspective or arbitrary warps, are also possible (Heckbert 1989, Wolberg 1990).) Note how we always specify the inverse warp, i.e., the mapping from final pixel coordinates x' to original coordinates x.
The warped or transformed continuous signal and its Fourier transform (in the affine case) are

    g_2(x') = g_1(a x' + t)  ⇔  G_2(u) = (1/a) G_1(u/a) e^{j u t / a}.    (4)
If the original signal is being compressed (Figure 1c), the Fourier transform becomes dilated
(stretched) along the frequency axis (Figure 1h).
Before resampling the warped signal, we prefilter (low-pass filter) it by convolving it with another kernel,

    g_3(x) = g_2(x) ∗ h_2(x)  ⇔  G_3(u) = G_2(u) H_2(u)    (5)

(Figure 1d/i). This is particularly necessary if the signal is being minified or decimated, i.e., if a > 1 in (3). If this filtering is not performed carefully, some additional aliasing may be introduced into the final sampled signal (Figure 1j).
Figure 2: Polyphase filtering. The coefficients used from h(x) for the black sample point at ai + t and the red sample point at a(i − 1) + t are different, and would be stored in different phases of the two-dimensional polyphase lookup table h_P(k; φ).
Fortunately, because of the linearity of convolution operators, the three stages of filtering and warping can be combined into a single composite filter

    H_3(u) = H_1(u/a) H_2(u),    (6)

which is often just a scaled version of the original interpolation filter h_1(x).^1 The final discrete convolution can be written as

    f'(i) = g_3(i) = Σ_j h_3(a i + t − j) f(j) = Σ_j h([a i + t − j]/s) f(j),    (7)

where s = max(1, a).
The filter in (7) is a polyphase filter, since the filter coefficients being multiplied with the input signal f(j) are potentially different for every value of i (Figure 2). To see this, we can rewrite (7) as

    f'(i) = Σ_j h([a i + t − j]/s) f(j) = Σ_j h_P(j* − j; φ) f(j),    (8)

where

    j* = ⌊a i + t⌋,    (9)
    φ = a i + t − j*,  and    (10)
    h_P(k; φ) = h([k + φ]/s).    (11)
^1 For ideal (sinc) reconstruction and low-pass filtering, the Fourier transform is a box filter of the smaller width, and hence the combined filter is itself a sinc (of larger width).
The values of h_P(k; φ) can be precomputed for a given value of s and stored in a two-dimensional lookup table.^2 The number of discrete fractional values of φ that need to be stored is related to the desired precision of the convolution, and is typically 2^b, where b is the number of bits of desired precision in the output (say 10 bits for 8-bit RGB images, to avoid error accumulation in multi-pass transforms).
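As an illustration, the lookup-table construction can be sketched in code. This is a minimal sketch, not the authors' implementation; the raised-cosine-windowed sinc kernel and the table resolution (`phases`) are our assumptions:

```python
import math

def windowed_sinc(x, cycles=4):
    """Raised-cosine-windowed sinc, nonzero for |x| < cycles."""
    if abs(x) >= cycles:
        return 0.0
    if x == 0.0:
        return 1.0
    window = 0.5 * (1.0 + math.cos(math.pi * x / cycles))
    return window * math.sin(math.pi * x) / (math.pi * x)

def build_polyphase_table(h, s, cycles=4, phases=1024):
    """Tabulate h_P(k; phi) = h([k + phi]/s) for phases fractional offsets.

    Each phase row is renormalized to sum to 1, as required by footnote 2.
    """
    half = int(math.ceil(cycles * s))      # kernel half-width grows when s > 1
    table = []
    for p in range(phases):
        phi = p / phases                   # fractional offset in [0, 1)
        row = [h((k + phi) / s) for k in range(-half, half + 1)]
        norm = sum(row)
        table.append([w / norm for w in row])
    return table, half
```

At run time, each output sample then indexes the table by its fractional position φ and performs a short dot product, avoiding any per-sample kernel evaluation.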
We can write the above formula (7) in a functional form

    f' = R(f; h, s, a, t).    (12)

In other words, R is an algorithm, parameterized by a continuous filter kernel h(x), scale factors s and a, and a translation t, which takes an input signal f(i) and produces an output signal f'(i). This operator is generalized in the next section to a pair of horizontal and vertical scale/shear operators, which are the basic building blocks for all subsequent algorithms.
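A minimal sketch of the operator R follows, evaluating the polyphase weights of (7)–(11) on the fly rather than from a precomputed table; the boundary handling (renormalizing kernels truncated at the signal edges) and the `n_out`/`support` parameters are our own choices:

```python
import math

def resample_1d(f, h, s, a, t, n_out, support=4):
    """f'(i) = sum_j h([a*i + t - j]/s) f(j), as in eq. (7).

    s = max(1, a) widens the kernel when minifying; support is the kernel
    half-width in its own units (4 for a 4-cycle windowed sinc).
    """
    half = int(math.ceil(support * s))
    out = []
    for i in range(n_out):
        x = a * i + t                      # inverse warp: output index -> input position
        j0 = math.floor(x)
        acc = norm = 0.0
        for j in range(j0 - half, j0 + half + 2):
            w = h((x - j) / s)
            if w and 0 <= j < len(f):
                acc += w * f[j]
                norm += w
        out.append(acc / norm if norm else 0.0)
    return out
```

With a = s = 1 and t = 0 this reduces to the identity; with a > 1 it low-pass filters and decimates in a single pass.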
4 Two-dimensional zooming

In this section, we review two-pass separable transforms, which can be accomplished by first resampling the image horizontally and then resampling the resulting image vertically (or vice versa). We can perform these operations by extending our one-dimensional resampling operator (12) to a pair of horizontal and vertical image resampling operators,

    f' = R_h(f, h, s, a_0, a_1, t)  ⇔  f'(i, j) = Σ_k h(s[a_0 i + a_1 j + t − k]) f(k, j)    (13)

and

    f' = R_v(f, h, s, a_0, a_1, t)  ⇔  f'(i, j) = Σ_k h(s[a_1 i + a_0 j + t − k]) f(i, k).    (14)

Note that these operators not only support directional scaling and translation, but also support shearing (using a different translation for each row or column), which is used for more complex transformations.
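The horizontal operator R_h can be sketched as below (R_v is the same with the roles of rows and columns exchanged). This is an illustrative sketch only; we widen the kernel by dividing its argument by s, following the convention of (7), and the per-row shear enters through the a_1·j term:

```python
import math

def resample_h(img, h, s, a0, a1, t, support=4):
    """R_h: each output pixel (i, j) reads input columns near a0*i + a1*j + t
    on the same row j (scale a0, shear a1, translation t)."""
    rows, cols = len(img), len(img[0])
    half = int(math.ceil(support * s))
    out = [[0.0] * cols for _ in range(rows)]
    for j in range(rows):
        for i in range(cols):
            x = a0 * i + a1 * j + t        # inverse warp along the row
            k0 = math.floor(x)
            acc = norm = 0.0
            for k in range(k0 - half, k0 + half + 2):
                w = h((x - k) / s)
                if w and 0 <= k < cols:
                    acc += w * img[j][k]
                    norm += w
            out[j][i] = acc / norm if norm else 0.0
    return out
```

Because each row is processed independently, this pass vectorizes and parallelizes naturally, which is what makes the multi-pass decomposition cheap in practice.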
^2 The values of h_P(k; φ) should be renormalized so that Σ_k h_P(k; φ) = 1.
Figure 3: Image magnification results: (a) original chirp image; (b) trilinear MIP-mapping; (c) EWA filtering; (d) windowed sinc function. Notice how the other techniques produce either excessive blur or aliasing. (Please look at these results by magnifying your document viewer.)

Image minification (zooming out) can be made more efficient using MIP-maps (Williams 1983) or rip-maps (Akenine-Möller and Haines 2002), as described in Section 2.

Figure 3 shows some examples of image magnification using trilinear MIP-mapping, EWA filtering, and windowed sinc low-pass filtering. Note how the windowed sinc function produces the least aliasing and blur.
5 Shear
In order to better explain our general affine multi-pass resampling algorithm, we start with the simpler case of a pure horizontal shear,

    [x]   [a_0  a_1  t] [x']
    [y] = [0    1    0] [y']
                        [1 ]    (15)

This corresponds to a general invocation of R_h with a_1 ≠ 0.
In the frequency domain (ignoring the effect of the translation parameter t, since it does not affect aliasing), this corresponds to a transformation of

    u' = A^T u,  where  A^T = [a_0  0]
                              [a_1  1].    (16)
Figure 4: Horizontal 3-pass shear: (a) original pixel grid, image, and its Fourier transform; (b) vertical upsampling onto the blue lines; (c) horizontal shear onto the diagonal red lines; (d) final vertical downsampling. The first row shows the sampling grids, the second row shows the images being resampled, and the third row shows their corresponding spectra. The frequency spectra in the third row are scaled to the unit square with maximum frequencies (±1, ±1) in order to make the upsampling and downsampling operations more intuitive.
Thus, a horizontal shear in the spatial domain induces a vertical shear in the frequency domain, which can lead to aliasing if we do not first upsample the signal vertically. (Notice how the original frequency (±1, ±1) gets mapped to (±a_0, ±1 ± a_1), which can be beyond the vertical Nyquist frequency.)^3
In order to avoid aliasing, we propose a three-pass algorithm, which consists of the following steps:

1. upsample vertically by the factor r ≥ 1 + a_1;
2. shear and scale horizontally, with filtering to avoid aliasing;
3. low-pass filter and downsample vertically.
In terms of geometric transformation, this corresponds to factoring

    A = [a_0  a_1]   [1  0  ] [a_0  a_1/r] [1  0]
        [0    1  ] = [0  1/r] [0    1   ] [0  r]  =  A_1 A_2 A_3,    (17)

and applying the sequence of transformations

    x = A_1 x_1,  x_1 = A_2 x_2,  x_2 = A_3 x',    (18)

as shown in Figure 4. The transpose of the middle matrix A_2 is

    A_2^T = [a_0    0]
            [a_1/r  1],    (19)

which, when multiplied by the maximum frequency present in the upsampled signal, (±1, ±1/r), still lies inside the Nyquist range u' ∈ [−1, +1]^2 (after horizontal filtering, which is applied during the scale/shear).
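The factorization (17) is easy to check numerically; a small sketch (the parameter values are arbitrary):

```python
def shear_factors(a0, a1, r):
    """A1, A2, A3 from eq. (17): vertical 1/r scale, horizontal shear/scale,
    vertical upsample by r (applied right-to-left to the output coordinates)."""
    A1 = [[1.0, 0.0], [0.0, 1.0 / r]]
    A2 = [[a0, a1 / r], [0.0, 1.0]]
    A3 = [[1.0, 0.0], [0.0, r]]
    return A1, A2, A3

def matmul2(A, B):
    """2x2 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]
```

Multiplying the three factors recovers the original shear matrix [[a_0, a_1], [0, 1]] exactly, confirming that the vertical up- and downsampling passes cancel geometrically and only serve to control aliasing.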
In our operational notation, this can be written as

    f_1 = R_v(f, h, 1, 1/r, 0, 0);    (20)
    f_2 = R_h(f_1, h, max(1, a_0), a_0, a_1/r, t);    (21)
    f_3 = R_v(f_2, h, r, r, 0, 0).    (22)
Figure 5: Horizontal 2-pass shear: (a) original Fourier transform; (b) vertical low-pass filtering; (c) horizontal shear. Notice how some high vertical frequencies are lost.
An alternative to this three-step process is a suboptimal two-pass algorithm:

1. vertically low-pass filter the image with a bandwidth 1/r;
2. shear, scale, (and prefilter) the image horizontally.

As we can see in Figure 5c, this results in a loss of some high-frequency information, compared to Figure 4d. (See the Experiments section for some examples.)
Unfortunately, the above derivation suggests that the upsampling rate can get arbitrarily high as a_1 grows. In fact, the maximum upsampling rate need never exceed r = 3 (Figure 6). This is because the pink portions of the spectrum along the left and right edge (high horizontal frequencies) do not appear in the final image, and can therefore be filtered away during the horizontal shear.
To compute a better value for r, we first compute the maximum values of the original frequencies that will appear in the final image, (u_max, v_max), as shown in Figure 6e. The value of v_max can be less than 1 if we are considering the general form of a shear matrix with vertical scaling included,

    A = [a_00  a_01]   [1  0     ] [a_00  a_01/r] [1  0]
        [0     a_11] = [0  a_11/r] [0     1     ] [0  r],    (23)

where we have combined the vertical scaling with the initial vertical resampling stage. To avoid

^3 From here on we use the convention that the frequencies range over [−1, +1], since this simplifies our notation.
Figure 6: Maximum vertical resampling rate: (a) original Fourier transform; (b) vertical upsampling by a factor of 3; (c) horizontal shear and low-pass filtering horizontally; (d) final vertical downsampling; (e) general case for computing u_max and v_max. Because horizontal frequencies start being suppressed (moved to beyond the Nyquist frequency), it is not necessary to upsample by more than a factor of 3.
aliasing, we must then ensure that

    A_2^T [    ±u_max   ]   [          ±a_00 u_max          ]
          [±a_11 v_max/r] = [ ±a_01 u_max/r ± a_11 v_max/r ]    (24)

lies within the bounding box [−1, +1]^2, i.e., that a_01 u_max/r + a_11 v_max/r ≤ 1, or r ≥ a_01 u_max + a_11 v_max. Whenever a_11 v_max > 1, we can clamp this value to 1, since there is no need to further upsample the signal. When the vertical scaling a_11 is sufficiently small (magnification), r < 1. Since there is no risk of aliasing during the horizontal shear, we set r = 1 and drop the final vertical downsampling stage. The formula for r thus becomes

    r ≥ max(1, a_01 u_max + min(1, a_11 v_max)).    (25)
The final three- (or two-) stage resampling algorithm is therefore:

    f_1 = R_v(f, h, 1/v_max, a_11/r, 0, 0);    (26)
    f_2 = R_h(f_1, h, max(1, a_00), a_00, a_01/r, t);    (27)
    f_3 = R_v(f_2, h, r, r, 0, 0),    (28)

where the last stage is skipped if r = 1.
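Equation (25) translates directly into code. The absolute values below are our addition to handle negative shear coefficients (the paper's perspective discussion in Section 8 notes that an absolute value is involved):

```python
def shear_rate(a01, a11, u_max, v_max):
    """Vertical upsampling rate r from eq. (25).

    Returns 1.0 when no vertical upsampling (and hence no final vertical
    downsampling pass) is needed, e.g., under sufficient magnification.
    """
    return max(1.0, abs(a01) * u_max + min(1.0, abs(a11) * v_max))
```

For example, a unit-scale shear with a_01 = 2 and full-band content (u_max = v_max = 1) needs r = 3, matching the worst-case bound of Figure 6.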
Figure 7: Shearing algorithm results: (a) trilinear MIP-mapping with anisotropic filtering; (b) EWA filtering; (c) suboptimal vertical-blur-only algorithm; (d) optimal three-stage algorithm. Please see our online web page http://research.microsoft.com/en-us/um/redmond/groups/ivm/HQMPIR/ for more results as well as animations that better show the aliasing artifacts.

Figure 7 shows some examples of shears rendered using our optimal three-pass algorithm, the suboptimal (blurred) two-pass shear algorithm, as well as a variety of previously developed resampling algorithms.
6 General affine

It is well known that any 2D affine transform can be decomposed into two shear operations (Heckbert 1989, Wolberg 1990). For example, if we perform the horizontal shear first, we have

    A = [a_00  a_01  t_0]   [b_0  b_1  t_2] [1     0     0  ]
        [a_10  a_11  t_1] = [0    1    0  ] [a_10  a_11  t_1]
        [0     0     1  ]   [0    0    1  ] [0     0     1  ],    (29)

with

    b_0 = a_00 − a_01 a_10 / a_11,  b_1 = a_01 / a_11,  and  t_2 = t_0 − a_01 t_1 / a_11.    (30)
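The coefficients in (30) can be computed directly; this is a small sketch, with the degenerate a_11 → 0 case left to the transpose logic discussed next:

```python
def two_shear_decomposition(a00, a01, a10, a11, t0, t1):
    """b_0, b_1, t_2 of the horizontal-shear factor in eqs. (29)-(30).

    Degenerates as a11 -> 0 (the 'bottleneck' case), in which case the
    image should be transposed and the matrix adjusted instead.
    """
    b0 = a00 - a01 * a10 / a11
    b1 = a01 / a11
    t2 = t0 - a01 * t1 / a11
    return b0, b1, t2
```

Multiplying the two shear matrices back together with these coefficients reproduces A, which is easy to verify by hand for any a_11 ≠ 0.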
Notice that the above algorithm becomes degenerate as a_11 → 0, which is a symptom of the bottleneck problem (Wolberg 1990). Fortunately, we can transpose the input (or output) image and adjust the transform matrix accordingly.
To determine whether to transpose the image, we first rescale the first two rows of A into unit vectors,

    Â = [â_00  â_01]   [a_00/l_0  a_01/l_0]
        [â_10  â_11] = [a_10/l_1  a_11/l_1],    (31)

where l_i = sqrt(a_i0^2 + a_i1^2). We then compute the absolute cosines of these vectors with the x and y axes, |â_00| and |â_11|, and compare these to the absolute cosines with the transposed axes, i.e., |â_01| and |â_10|. Whenever |â_00| + |â_11| < |â_01| + |â_10|, we transpose the image.
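The transpose test of (31) can be sketched as:

```python
import math

def should_transpose(a00, a01, a10, a11):
    """Transpose when the rows of A align better with the swapped axes (eq. 31)."""
    l0 = math.hypot(a00, a01)              # length of first row
    l1 = math.hypot(a10, a11)              # length of second row
    direct = abs(a00 / l0) + abs(a11 / l1)
    swapped = abs(a01 / l0) + abs(a10 / l1)
    return direct < swapped
```

For the identity this returns False, while for a 90° rotation it returns True, since the rows then align exactly with the swapped axes; a 45° rotation sits on the boundary and is not transposed.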
Having developed a three-pass transform for each of the two shears, we could concatenate these to obtain a six-pass separable general affine transform. However, it turns out that we can collapse some of the shears and subsequent upsampling or downsampling operations to obtain a four-pass transform, as shown in Figure 8.

The trick is to perform the horizontal upsampling needed for later vertical shearing at the same time as the original horizontal shear. In a similar vein, the vertical downsampling can be performed in the same pass as the vertical shear and scale.
In terms of geometric transformations, this corresponds to a factorization of the form

    A = [1  0         0] [b_0  a_01/r_v  t_2] [1/r_h  0  0] [1  0    0] [1                0  0       ] [r_h  0  0]
        [0  a_11/r_v  0] [0    1         0  ] [0      1  0] [0  r_v  0] [a_10/(a_11 r_h)  1  t_1/a_11] [0    1  0]
        [0  0         1] [0    0         1  ] [0      0  1] [0  0    1] [0                0  1       ] [0    0  1]

      = [1  0         0] [b_0/r_h  a_01/r_v  t_2] [1                    0    0           ] [r_h  0  0]
        [0  a_11/r_v  0] [0        1         0  ] [a_10 r_v/(a_11 r_h)  r_v  t_1 r_v/a_11] [0    1  0]
        [0  0         1] [0        0         1  ] [0                    0    1           ] [0    0  1].    (32)

In order to compute the appropriate values for r_v and r_h, we must first determine which frequencies in the original image need to be preserved in the final image, as shown in Figure 6e. Frequencies that get mapped completely outside the final spectrum can be prefiltered away during the upsampling stages, thereby reducing the total number of samples generated. We compute the
Figure 8: 4-pass rotation: (a) original pixel grid, image, and its Fourier transform; (b) vertical upsampling; (c) horizontal shear and upsampling; (d) vertical shear and downsampling; (e) horizontal downsampling. The general affine case looks similar except that the first two stages perform general resampling.
Figure 9: Two-dimensional chirp pattern affinely resampled using: (a) high-quality bicubic; (b) trilinear MIP-map with anisotropic filtering; (c) EWA filter; (d) high-quality four-pass rendering (this paper). Please zoom in on the images to see more details.
values of (u_max, v_max) by intersecting the [−1, +1]^2 square with the projection of the final spectrum onto the original spectrum through A^{−T} (the dashed blue lines in Figure 8).
Once we know the (u_max, v_max) extents, we can compute the upsampling rates using

    r_v ≥ max(1, a_01 u_max + min(1, a_11 v_max))  and    (33)
    r_h ≥ max(1, (a_10 r_v / a_11) v_max + min(1, b_0 u_max)).    (34)
The final four-step algorithm therefore consists of:

    f_1 = R_v(f, h, 1/v_max, a_11/r_v, 0, t_1);    (35)
    f_2 = R_h(f_1, h, 1/u_max, b_0/r_h, a_01/r_v, t_2);    (36)
    f_3 = R_v(f_2, h, r_v, r_v, a_10 r_v/(a_11 r_h), 0);    (37)
    f_4 = R_h(f_3, h, r_h, r_h, 0, 0).    (38)
If MIP-maps or rip-maps are being used (Section 4), the amount of initial downsampling can be reduced by finding the smallest pyramid level that contains the frequency content needed to reconstruct the warped signal and adjusting the matrix entries in A appropriately.
7 Experimental results
We have evaluated our algorithm on both synthetic and natural images. On our online web page http://research.microsoft.com/en-us/um/redmond/groups/ivm/HQMPIR/, results are shown for pure zoom (Figure 3), the three-pass 1-D shear (Figure 4), and two four-pass cases: simultaneous rotation plus zoom and tilted orthographic rotation (Figure 9). In each case we also show results for the highest-quality Windows GDI+ warp, EWA, and the NVidia 7800 GPU performing trilinear anisotropic texture filtering. Filtering artifacts are sometimes more apparent under animation, so we provide animated examples of each of these transforms.
The synthetic image we use for most of our evaluation is a 2D chirp (Figure 9), since it contains a broad spectrum of frequencies up to the Nyquist rate in both u and v. Using this pattern, excessive low-pass filtering appears as a dimming of the chirp and aliasing appears as a Moiré pattern. The equation we used to generate the chirp is:

    f(x, y) = cos(2π k_x x^2) · cos(2π k_y y^2).    (39)
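For reference, the chirp pattern of (39) can be generated as below. The normalized coordinates and the sweep rates k_x, k_y are illustrative choices, since the paper does not state the constants used:

```python
import math

def chirp(size, kx=4.0, ky=4.0):
    """2-D chirp test image, f(x, y) = cos(2*pi*kx*x^2) * cos(2*pi*ky*y^2),
    on coordinates x, y normalized to [0, 1); frequency increases with x and y."""
    img = []
    for j in range(size):
        y = j / size
        row = [math.cos(2.0 * math.pi * kx * (i / size) ** 2) *
               math.cos(2.0 * math.pi * ky * y * y)
               for i in range(size)]
        img.append(row)
    return img
```

Larger kx and ky push the instantaneous frequency toward the Nyquist rate at the far corner, which is what makes residual aliasing show up as a visible Moiré pattern.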
In our evaluation we show the transformed images as well as the result of applying the transform and then applying the inverse transform to warp back to the original pixel grid. For animated results, this round-trip transform is useful because the eye is not distracted by the motion of the image and can focus on the blur and aliasing artifacts (Dodgson 1992).
We have also evaluated the results on the natural images shown in Figure 10. For these results, we refer users to the supplementary web page, where the ability to compare images and the animations show the benefits of our algorithm. The greatest aliasing effects can be seen for diagonal frequencies in the GDI+ and GPU anisotropic filters, and this can be seen as Moiré patterns in areas of fine detail. The balance between aliasing and blurring can be set subjectively for EWA by altering the α parameter, but one or both are always present. When viewing our results, notice that our algorithm has far less aliasing, evident in the lack of Moiré in both the chirp results and the picket fence of the lighthouse image. Other algorithms also display more aliasing in the wheel areas of the bikes image. Fortunately, the lack of aliasing in our algorithm does not come at the expense of sharpness, since we maintain high-frequency details where other techniques have blurred them out.
Figure 10: (a) Lighthouse test image; (b) Bikes test image.
8 Extensions and future work
While this paper has developed the basic theory for multipass nonaliasing afﬁne transforms, it
also suggests a number of promising directions for future research. These include optimizing the
order of shears and scalings, rendering triangle meshes, memory optimizations, and full perspective
warping.
Order of shears In this paper, we have implemented the general affine warp as a horizontal shear followed by a vertical shear (with additional transpose and/or resampling stages, as needed). In theory, the order of the shears should not matter if perfect filtering is used. In practice, non-ideal filters may induce a preferred ordering, as might computational considerations. For example, we may want to defer large amounts of magnification until later stages in the pipeline.
Filter selection and optimization In our current implementation, we have used a 4-cycle (9-tap) windowed sinc filter, since it provides a reasonable tradeoff between quality and efficiency. Using more taps (up to 6 or so) results in slight reductions in aliasing, while using fewer results in significant degradation. It would be worthwhile to investigate alternative filters, especially those designed to reduce ringing while preserving high-frequency content. Dodgson (1992) contains a nice discussion of nonlinear filters that might be appropriate.
Triangle mesh rendering In this paper, we haven't said anything about the sizes of the intermediate images required during the intermediate stages. If we generalize the final image to a triangle or polygon, we can warp its shape back through successive stages and also add required additional pixels around the boundaries to ensure that each filter has sufficient support. Only the pixels inside each support region would then have to be resampled.

A more interesting question is whether triangle meshes could be rendered by reusing warped and filtered samples from adjacent triangles. This is a subtle issue, since if there are large discontinuities in the local affine transforms across triangle edges, visible errors might be induced.
Tiling and pipelining While GPUs are relatively insensitive to the order in which texture-memory pixels are fetched (so long as pixels are reused several times during computation), the same is not true for regular CPU memory. Optimizing memory accesses by breaking up the image into smaller 2D tiles and potentially pipelining the computation may result in significant speedups.
Perspective The family of multi-pass scanline algorithms such as (Catmull and Smith 1980) on which our work is based includes transforms such as perspective. We have not yet fully developed the theory of optimal multi-pass perspective algorithms because achieving full computational efficiency is tricky.

Perspective resampling is usually implemented by locally computing an affine approximation to the full transform and using its parameters to control the amount of filtering (Greene and Heckbert 1986, Wolberg 1990, McCormack et al. 1999). We could easily take Catmull and Smith's original 2-pass perspective transform and replace each stage with an optimal per-pixel polyphase filter. (The filter bandwidth would vary spatially.) The difficulty lies in computing the amount of upsampling that needs to be applied before each one-dimensional (perspective) shearing stage. Since this quantity is nonlinear in the affine parameters because of the absolute value, there is no rational linear formula that would locally determine the amount of upsampling required, which leads to a more complex algorithm. We could always just upsample each image/stage by the theoretical maximum of r = 3; instead, we leave the development of the full perspective case to future work.
9 Conclusions
In this paper, we have developed a four-stage scanline algorithm for affine image warping and resampling. Our algorithm uses optimal one-dimensional filtering at each stage to ensure that the image is neither excessively blurred nor aliased, which is not the case for previously developed algorithms. Because each stage only uses one-dimensional filters, the overall computational efficiency is very good, being amenable to GPU implementation using pixel shaders. While our algorithm may not be suitable for some applications such as high polygon-complexity scenes, we believe that it forms the basis of a new family of higher-quality resampling and texture mapping algorithms with wide applicability to scenes that require high visual fidelity.
References
Akenine-Möller, T. and Haines, E. (2002). Real-Time Rendering. A K Peters, Wellesley, Massachusetts, second edition.

Barkans, A. C. (1997). High quality rendering using the Talisman architecture. In Proceedings of the Eurographics Workshop on Graphics Hardware.

Betrisey, C. et al. (2000). Displaced filtering for patterned displays. In Society for Information Display Symposium, pages 296–299.

Catmull, E. and Smith, A. R. (1980). 3-D transformations of images in scanline order. Computer Graphics (SIGGRAPH '80), 14(3), 279–285.

Dodgson, N. A. (1992). Image Resampling. Technical Report TR261, Wolfson College and Computer Laboratory, University of Cambridge.

Ewins, J. et al. (1998). Mip-map level selection for texture mapping. IEEE Transactions on Visualization and Computer Graphics, 4(4), 317–329.

Greene, N. and Heckbert, P. (1986). Creating raster Omnimax images from multiple perspective views using the elliptical weighted average filter. IEEE Computer Graphics and Applications, 6(6), 21–27.

Heckbert, P. (1986). Survey of texture mapping. IEEE Computer Graphics and Applications, 6(11), 56–67.

Heckbert, P. (1989). Fundamentals of Texture Mapping and Image Warping. Master's thesis, The University of California at Berkeley.

Hüttner, T. and Straßer, W. (1999). Fast footprint MIPmapping. In 1999 SIGGRAPH / Eurographics Workshop on Graphics Hardware, pages 35–44.

McCormack, J., Perry, R., Farkas, K. I., and Jouppi, N. P. (1999). Feline: Fast elliptical lines for anisotropic texture mapping. In Proceedings of SIGGRAPH 99, pages 243–250.

Mitchell, D. P. and Netravali, A. N. (1988). Reconstruction filters in computer graphics. Computer Graphics (Proceedings of SIGGRAPH 88), 22(4), 221–228.

Oppenheim, A. V., Schafer, R. W., and Buck, J. R. (1999). Discrete-Time Signal Processing. Prentice Hall, Englewood Cliffs, New Jersey, 2nd edition.

Schilling, A., Knittel, G., and Straßer, W. (1996). Texram: A smart memory for texturing. IEEE Computer Graphics & Applications, 16(3), 32–41.

Smith, A. R. (1981). Digital Filtering Tutorial for Computer Graphics. Technical Memo 27, Computer Graphics Project, Lucasfilm Ltd. Revised Mar 1983, available on http://alvyray.com/Memos/MemosPixar.htm.

Triggs, B. (2001). Empirical filter estimation for subpixel interpolation and matching. In Eighth International Conference on Computer Vision (ICCV 2001), pages 550–557, Vancouver, Canada.

Unser, M. (1999). Splines: A perfect fit for signal and image processing. IEEE Signal Processing Magazine, 16(6), 22–38.

Williams, L. (1983). Pyramidal parametrics. Computer Graphics, 17(3), 1–11.

Wolberg, G. (1990). Digital Image Warping. IEEE Computer Society Press, Los Alamitos.