Viewpoint Consistent Texture Synthesis
Alexander Neubeck, Alexey Zalesny, and Luc Van Gool
Swiss Federal Institute of Technology Zurich
Computer Vision Lab
{aneubeck,zalesny,vangool}@vision.ee.ethz.ch
Abstract
The purpose of this work is to synthesize textures of
rough, real world surfaces under freely chosen viewing and
illumination directions. Moreover, such textures are pro-
duced for continuously changing directions in such a way
that the different textures are mutually consistent, i.e. emu-
late the same piece of surface. This is necessary for 3D an-
imation. It is assumed that the mesostructure (small-scale)
geometry of a surface is not known, and that the only in-
put consists of a set of images, taken under different view-
ing and illumination directions. These are automatically
aligned to build an appropriate Bidirectional Texture Func-
tion (BTF). Directly extending 2D synthesis methods for
pixels to complete BTF columns has drawbacks which are
exposed, and a superior sequential but highly parallelizable
algorithm is proposed. Examples demonstrate the quality of
the results.
1. Introduction
Textures are often used to convey an impression of sur-
face detail, without having to invest in fine-grained geom-
etry. Starting from example images, several methods can
model textures and synthesize more extended areas of them
[1, 4–7, 13, 14, 16].
A shortcoming of early methods was that these textures
would remain fixed under changing viewing or lighting di-
rections, whereas the 3D structures they were supposed to
mimic would lead to changing self-occlusion and shadow-
ing effects. Therefore, more recent texture modeling and
synthesis methods include such effects [3,8,10–12,15], with
some of our own work as an early example [18,19]. These
recent methods can be used to cover curved surfaces with
textures that adapt their parameters to the different, local
orientations with respect to the viewpoint and light sources.
The gain in realism when compared to fixed texture fore-
shortening and shading is often striking.
Animations, where the viewpoint or light sources move
with respect to the surface, add the requirement of texture
consistency over time. Subsequent textures for the same
surface patch should seem to all visualize the same physical
surface structure. Only a subset of the previous methods can
produce such consistent time series of textures. Examples
are those which explicitly retrieve surface height or normal
information [3, 12] or stitch together texton representations
that already include their appearance under several viewing
conditions [11, 15] and where the ‘textons’ can shrink to a
single pixel [8, 10]. Both approaches limit the geometric
complexity that can be handled. For the former this imme-
diately stands to reason, but it may be less obvious for the
latter. We will return to this issue shortly.
In this paper we propose a novel method of the latter
type, i.e. based on a kind of copy-and-paste approach. It
tries to overcome the limitations of current methods, by
widening the choice of basic material to copy from with-
out the need for additional sample images. Also, we deal
with real, colored textures, rather than synthetic or grey
level ones, as has sometimes been the case.
The paper is organized as follows. Section 2 describes
the input images that we use, as well as the way in which
the incoming information is organized. Section 3 describes
the texture synthesis method that exploits this information
to produce consistent textures. Section 4 shows examples.
Section 5 concludes the paper.
2. Bidirectional Texture Function
2.1. Image stacks as input
The input consists of images of a planar sample of
the texture, taken under known viewing conditions (known
viewing and illumination directions). The biggest publicly
available image database of this kind is the CUReT [2],
which has some shortcomings. Firstly, the representation is
not complete because for each illumination direction there
is only a 1D trajectory of viewing angles. Secondly, im-
age quality is rather low, especially in terms of the colors.
Another, but much smaller database has been produced by
Koudelka et al. [10].
Figure 1. Setup for taking pictures under variable viewing
and illumination directions.
Hence, we have constructed a specially designed apparatus,
shown in Fig. 1. It consists of
a camera arm that can be moved to specified inclinations,
several light sources that can be individually activated, and
a turntable on which the texture sample is placed. The arm
and table rotate about orthogonal axes, thereby covering a
complete hemisphere of directions. The sessions are com-
puter controlled, so that the user only needs to specify de-
sired intervals. All images are then taken automatically at
the desired angular resolutions, except for the illumination
directions, which are limited by the light sources present in
the setup. There have been only 4 lamps so far, but many
more are planned in the follow-up version of the setup. We
plan to make the texture data available [17].
Starting from the different images the bidirectional tex-
ture function or ‘BTF’ can be constructed. The BTF repre-
sentation was introduced by Dana et al. [2]. It contains the
intensities or colors observed for a certain point (specified
by its ‘texture coordinates’) on the texture, for each viewing
and lighting direction. Hence, it is a 6D function. In prac-
tice this function is sampled by thousands of images taken
under different illumination and viewing directions, hence
the need for a largely automated setup as just described.
These images are rectified by projective transformations, to
let their outlines precisely fit those of the frontal view. The
determination of the transformations is facilitated by a set
of markers placed around the texture patch on the turntable.
Fig. 2 shows two images of the same texture (an oblique
view on the left and the frontal view on the right). The
image in the middle is the image on the left, projectively
aligned with the frontal view on the right.
Figure 2. Stack alignment. Left: oblique view; middle:
oblique view after alignment to the frontal view; right: frontal
view.
Figure 3. Lichen texture. Left: frontal view; right: oblique
view.
Such alignment is part of the automatic image capturing
procedure. This alignment removes the global, perspective
distortions. We will refer to the complete set of aligned
images of a texture as a BTF stack. When only the data
at a fixed pixel location is considered, we refer to the
corresponding data subset as a BTF column.
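As an illustration of the alignment step just described, the following minimal Python sketch rectifies an oblique view onto the frontal view with a planar homography and extracts a BTF column from the resulting stack. This is not the authors' implementation; the use of OpenCV and the assumption that the marker coordinates have already been detected in both images are ours.

```python
# Sketch of BTF stack alignment (not the original implementation).
# Assumes marker positions have already been detected in each image; OpenCV
# is used here only as a convenient stand-in for the projective alignment.
import cv2
import numpy as np

def rectify_to_frontal(oblique_img, markers_oblique, markers_frontal):
    """Warp an oblique view onto the frontal view via a planar homography."""
    H, _ = cv2.findHomography(markers_oblique.astype(np.float32),
                              markers_frontal.astype(np.float32))
    h, w = oblique_img.shape[:2]
    return cv2.warpPerspective(oblique_img, H, (w, h))

def btf_column(stack, y, x):
    """stack: aligned images, shape (n_conditions, H, W, 3).
    Returns the BTF column at pixel (y, x), shape (n_conditions, 3)."""
    return stack[:, y, x, :]
```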
One such BTF column of the “lichen” texture example
of Koudelka et al. [10] (Fig. 3) is visualized in Fig. 4. The
same pixel intensities are shown twice, using two different
orderings of the imaging conditions. Each small block of
the top image consists of pixels ordered according to the
two viewing angles, whereas the blocks themselves are or-
dered according to the two lighting angles.
Figure 4. BTF column visualization (cutout). Top: each
block represents the tilt/pan angles of the viewing direction;
blocks are arranged by the tilt/pan angles of the lighting
direction. Bottom: viewing and lighting angles switched in
the ordering.
In the bottom
image the roles of viewing and lighting are swapped. Al-
ready at first glance it is clear that the intensities within the
blocks are smoother when they are arranged by lighting an-
gles. This has to do with the 3D nature of the real sur-
face: changing only the lighting while keeping the viewing
direction fixed ensures that a fixed pixel in the images still
corresponds to one fixed point on the sur-
face. In case the viewing direction is changed, alignment
through a simple planar projectivity cannot avoid that the
same pixel will now correspond to different points on the
surface, thereby increasing the changes in intensities.
Smoothness in the BTF is important, as it improves the
creation of intermediate views. BTF-based rendering is usu-
ally based on pixelwise linear interpolation between nearest
views. The more similar these neighboring views are, the
better it works. The same holds for the synthesis of en-
tirely novel texture patches by smart copy-and-pasting of
BTF data (a strategy used in e.g. [8, 10]). Seamless knitting
based on smooth functions is easier than for functions with
lots of variation.
This brings us to an issue that has not received much
attention yet, but that has an important impact on the use-
fulness of BTF stacks. Due to the 3D nature of most real
textures, the BTF is not unique. So far images were aligned
to the coplanar markers on the turntable, but a similar align-
ment based on parallel planes at different heights yields
different BTFs for exactly the same texture patch. BTF
smoothness can be increased by making an optimal choice.
This is discussed next.
2.2. Improved BTF Alignment
As mentioned in the last section, the BTF representation
of a texture is not unique, as it depends on the choice of the
alignment plane. This choice will also have an influence
on the smoothness of the BTF, and therefore on its poten-
tial for texture synthesis. As mentioned, the alignment to
a reference plane (like that of the turntable markers) can-
not avoid that the same pixel will correspond to different
physical points on the surface (texture sample). This drift is
illustrated in Fig. 5. If not all surface points lie in the same
plane, images taken under different viewing directions can-
not be aligned to map all points onto each other, even under
the simplifying assumption of parallel projection. On the
other hand, drift effects can be minimized and thus BTF
smoothness increased by aligning with respect to a plane
within the height range of the texture.
For texture synthesis based on copy-and-pasting of BTF
data – which is also the basis of the approach presented in
this paper – it stands to reason that a good BTF stack is one
with maximal smoothness. Such a stack could also support
further subsampling: if a view can be interpolated very well
by nearby views, it can be skipped.
Figure 5. Nonuniqueness of the BTF representation. A
surface point is mapped to distinct pixels in the BTF; the
pixel drift depends on the position of the projection plane.
The smoothness is maximized by choosing the alignment
plane which minimizes
the average Euclidean distance between the intensities of
neighboring views sharing the same lighting direction. That
is, the appearance change is measured as a pixelwise Eu-
clidean distance with respect to the four neighboring cam-
era positions. The average value of this Euclidean distance
over the whole BTF stack is calculated for several plane po-
sitions within a reasonable range. The plane corresponding
to the minimal distance is chosen. In Section 4 we show
an example of the beneficial influence that such selection
of the alignment plane has on the quality of the rendering
(Fig. 10).
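A hedged sketch of this alignment-plane search is given below. The helpers align_stack_for_height and camera_neighbors are placeholders for the capture-geometry-dependent parts, which the paper does not spell out; only the smoothness criterion itself follows the text.

```python
# Sketch of the alignment-plane selection of Section 2.2 (assumptions noted
# in the text above): score each candidate plane height by the mean pixelwise
# Euclidean distance between neighboring camera views under the same lighting.
import numpy as np

def stack_smoothness(stack, neighbor_pairs):
    """Mean pixelwise Euclidean distance over pairs of neighboring views
    that share the same lighting direction (lower = smoother BTF)."""
    dists = [np.linalg.norm(stack[a].astype(float) - stack[b].astype(float),
                            axis=-1).mean()
             for a, b in neighbor_pairs]
    return float(np.mean(dists))

def choose_alignment_plane(raw_images, candidate_heights,
                           align_stack_for_height, camera_neighbors):
    """Pick the plane height whose re-aligned stack is smoothest."""
    best_height, best_score = None, np.inf
    for height in candidate_heights:
        stack = align_stack_for_height(raw_images, height)  # re-rectification
        score = stack_smoothness(stack, camera_neighbors)
        if score < best_score:
            best_height, best_score = height, score
    return best_height
```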
2.3. Column vs. view-specific BTF copying
Suppose we extend a smart copying type of texture syn-
thesis for single views [1, 4, 5, 13, 16] to complete BTF
columns, i.e., complete BTF columns are copied and pasted
instead of RGB values (like in [8,10]), based on their com-
patibilities with already synthesized parts of the texture. It
would be extremely difficult to avoid strong seams, except
for very regular textures.
This can be explained as follows. Consider the simple
artificial texture shown in Fig. 6 (left). It was created by
the superposition of Gaussian functions that are assumed to
simultaneously specify surface height and intensity. They
were centered at the nodes of a randomly deformed grid
and also their widths (standard deviations) were randomly
chosen. For this surface the BTF stack was generated by
orthogonal projections. Due to the 3D nature of the tex-
ture, BTF columns will always comprise information about
a neighborhood rather than a single surface point. Differ-
ently shaped Gaussians have spots with similar orientations
and would therefore share similar BTF columns if this in-
formation mixing did not occur. But as BTF columns are
bound to combine information from several surface points,
differently shaped Gaussians will no longer share any BTF
column.
Figure 6. Artificial texture. Left: original; middle: copy-and-
paste applied to complete BTF columns; right: result of a
per-view copy-and-paste synthesis.
A single BTF column already contains sufficient
data to reconstruct the Gaussian blob it was sampled from.
As all these Gaussians have different shapes, their BTF
columns will not be easy to combine. Hence, except for
verbatim copying of large chunks – which is often undesir-
able and would still result in seams between these chunks
– it is nontrivial to make neighboring columns consistent
and thereby avoid seams. Chances of forming new Gaus-
sian shapes are slim. The result of a copy-and-paste strat-
egy (see section 3.1) applied to complete BTF columns for
this example is shown in the middle image of Fig. 6. The
higher complexity of real surfaces only worsens the prob-
lem. It will be very improbable to find two BTF columns
in a rough irregular texture patch, which share the same re-
flectance properties for all viewing and lighting directions.
This makes synthesis by copying whole BTF columns al-
most impossible.
Although the choice of an appropriate alignment plane
reduces the problem of finding compatible BTF columns,
this will not suffice. Taking much larger texture samples and
therefore increasing the number of BTF columns to choose
from can also remedy this problem to some extent, but is
not always practical. Thus, on top of alignment plane op-
timization we propose an alternative strategy to enlarge the
choice of samples.
The proposed approach no longer copies and pastes com-
plete BTF columns, but only the data from the BTF that are
relevant for a specific viewing condition, i.e. only a small
part of the BTF. Hence, rather than synthesizing all views si-
multaneously, they are created one by one. Initially, a single
view is synthesized, which can be done with any traditional
texture synthesis algorithm. Then, this so-called support
view is used to guide the synthesis of the other views. In our
experiments we have always used a frontal view as the sup-
port view, but this is not necessary. The support view has to
ensure that the different textures are consistent. The small
part of the BTF columns that is used consists of the data
(intensity or color) for the support view and of the desired
view (different viewing and or lighting directions). Now
only small parts of the BTF columns need to be compatible,
thereby drastically increasing the choice, as we will com-
bine parts from different columns to synthesize the corre-
sponding pixel in different views. The price clearly is a loss
in efficiency, but the quality of such a procedure is superior,
as is shown in Fig. 6(right) for the artificial example with
the Gaussians. There are no longer seams visible, while
simultaneously the variation in acceptable hill shapes has
increased.
During this sequential synthesis two conflicting issues
must be reconciled: texture quality and viewpoint consistency.
On the one hand seams might be better hidden when there
are more choices of stitching patches together. On the other
hand, requiring view consistency still amounts to restricting
the choice. The intuition behind our approach is similar to
Image Analogies [9]: given the original frontal view A, its
viewpoint consistent oblique view B, and a synthetic frontal
view A′ (the support view), the synthetic oblique view B′
has to be created in a way consistent with A′. From a BTF
column, only the entries from A and B are used. The copy-
and-paste process is guided both by consistency with a pixel’s
neighborhood in the fixed support A′ and by the data already
available in the corresponding neighborhood within the B′
view under construction. As the synthesis of B′ proceeds,
a dynamic weighting scheme increases its influence in this
comparison. After carrying out such synthesis for the differ-
ent possible views B′, a complete stack can be built again,
where the BTF columns are no longer one-to-one copies of
original BTF columns. Instead, they are compositions of
several columns, thereby yielding many more possibilities.
Fig. 7 compares columnwise synthesis (left) with our se-
quential approach (right) for a real colored white-pea tex-
ture. Columnwise synthesis again produces more salient
seams.
Figure 7. Left: columnwise synthesis (salient seams are
visible). Right: sequential synthesis.
3. Synthesis Algorithm
As described in the previous section our sequential syn-
thesis scheme first synthesizes a single view, the support
view, and then the remaining ones. Therefore, the details
of the single-view synthesis are explained first.
3.1. Single View Synthesis
Our single view synthesis algorithm combines the
non-parametric multiscale texture synthesis of Paget and
Longstaff [13], which samples from an example image to
build a similar output texture, with Ashikhmin’s candidate
search [1]. The latter reduces the search complexity for sim-
ilar neighborhoods by introducing a reasonable subset of
possible candidates from the example image.
At the start the output image – which is to become the
support view – is randomly initialized such that its his-
togram equals the histogram of the input image. Then a
pixel of the output image is randomly chosen and a set of
possible candidate pixels in the input image is created (this
will be explained shortly). Next the neighborhood of the
chosen output pixel is compared to the neighborhoods of all
the candidates. It is replaced by the intensity (or color) of
the input candidate pixel with the best matching neighbor-
hood. This procedure is repeated until all output pixels have
been visited a number of times. More details follow.
To fill the chosen pixel of the output image with an inten-
sity (or color), its neighborhood is investigated in order to
select good candidate values from the input sample. More
precisely, input pixels are identified whose neighborhood
contains, at the same relative position, an intensity (or
color) identical to one already present in the neighborhood
of the output pixel. This is illustrated in Fig. 8. The input
pixel neighborhood with
the smallest Euclidean distance between the intensities (or
RGB-values) of its pixels and the corresponding ones of the
output pixel under scrutiny is considered to match best. The
output pixel is updated and takes on the intensity (or color)
of the input pixel with this neighborhood. As a matter of
fact, the updating procedure is iterative and when visiting a
pixel anew it is only updated if an input pixel with a better
matching neighborhood is found.
Figure 8. Candidate neighborhood selection for synthesis:
output pixel (i, j) with neighbor (i+u, j+v) in the output
texture, candidate input pixel (p, q) with neighbor (p+u, q+v)
in the input texture.
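The candidate search described above (and sketched in Fig. 8) can be written down roughly as follows. This is only an illustrative Python sketch under our own assumptions (exact value matching and a simple value-to-position index over the input image), not the authors' code.

```python
# Sketch of the Ashikhmin-style candidate search: an input pixel (p, q) is a
# candidate for output pixel (i, j) if its neighbor at offset (u, v) carries
# the same value as the already-synthesized output neighbor at (i+u, j+v).
import numpy as np
from collections import defaultdict

def build_value_index(inp):
    """Map each input value (as a tuple) to the positions where it occurs."""
    index = defaultdict(list)
    for p in range(inp.shape[0]):
        for q in range(inp.shape[1]):
            index[tuple(inp[p, q])].append((p, q))
    return index

def candidate_pixels(out, i, j, offsets, value_index, inp_shape):
    """Collect candidate input positions for the output pixel (i, j)."""
    h, w = inp_shape[0], inp_shape[1]
    candidates = set()
    for (u, v) in offsets:
        ni, nj = i + u, j + v
        if not (0 <= ni < out.shape[0] and 0 <= nj < out.shape[1]):
            continue
        for (pos_p, pos_q) in value_index.get(tuple(out[ni, nj]), ()):
            cp, cq = pos_p - u, pos_q - v   # shift back to the candidate center
            if 0 <= cp < h and 0 <= cq < w:
                candidates.add((cp, cq))
    return candidates
```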
To guide the updating procedure, a ‘confidence’ is as-
signed to each output pixel. It specifies the chance of the
pixel to be selected for an update: the higher the confi-
dence, the lower this chance gets. Basically, the confidence
counts the number of previous updates a pixel has under-
gone, as this number gives a good idea of how reliable its
current value is. The confidence values w(i, j) for all pixels
(i, j) are initialized to zero. Whenever a pixel is visited,
whatever the outcome, a constant value is added to its
confidence:
w(i, j) := w(i, j) + 1/T, (1)
where T is the total number of times each pixel will be vis-
ited (as a rule T = 4). A pixel with confidence 1 will no
longer be selected or updated. The chance of being selected
for a possible update is
P(i, j) = \frac{1 - w(i, j)}{\sum_{(i, j) \in \mathrm{Image}} (1 - w(i, j))}, (2)
where w(i, j) \in [0, 1] is the confidence of pixel (i, j) and
P(i, j) is the visiting probability.
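A direct transcription of the bookkeeping of Eqs. (1) and (2) could look like the following sketch (our own, with numpy's random generator as an assumed convenience).

```python
# Confidence update of Eq. (1) and visiting probability of Eq. (2).
import numpy as np

def visit(w, i, j, T=4):
    """Eq. (1): add 1/T to the confidence of pixel (i, j), capped at 1."""
    w[i, j] = min(1.0, w[i, j] + 1.0 / T)

def pick_pixel(w, rng):
    """Eq. (2): sample a pixel with probability proportional to 1 - w(i, j)."""
    weights = (1.0 - w).ravel()
    total = weights.sum()
    if total <= 0:                       # every pixel already has confidence 1
        return None
    idx = rng.choice(weights.size, p=weights / total)
    return np.unravel_index(idx, w.shape)

# Usage: rng = np.random.default_rng(0); i, j = pick_pixel(w, rng); visit(w, i, j)
```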
The confidences also have a second role to play. The
more confidence we have in a pixel, the more it con-
tributes to the Euclidean distance in the comparison be-
tween the input and output neighborhoods:
\mathrm{dist}(i, j, p, q) = \sum_{(u, v) \in N} w(i+u, j+v) \cdot \| \mathrm{out}(i+u, j+v) - \mathrm{in}(p+u, q+v) \|^2, (3)
where the neighborhood N = \{(u, v) \mid 0 < u^2 + v^2 \le r^2\}
is a disk of fixed radius r, and \mathrm{in}(\cdot) resp. \mathrm{out}(\cdot)
are the input resp. output intensities (for neighbors outside
the image boundary, intensity differences are set to the
maximal value of 255).
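Eq. (3) translates almost literally into code. The sketch below is ours, and it assumes a weight of 1 for the maximal-penalty term when a neighbor falls outside either image, which the paper leaves unspecified.

```python
# Confidence-weighted neighborhood distance of Eq. (3).
import numpy as np

def neighborhood_distance(out, inp, w, i, j, p, q, offsets):
    d = 0.0
    for (u, v) in offsets:
        oi, oj, pi, pj = i + u, j + v, p + u, q + v
        inside = (0 <= oi < out.shape[0] and 0 <= oj < out.shape[1] and
                  0 <= pi < inp.shape[0] and 0 <= pj < inp.shape[1])
        if inside:
            diff = out[oi, oj].astype(float) - inp[pi, pj].astype(float)
            d += w[oi, oj] * float(np.sum(diff * diff))
        else:
            d += 255.0 ** 2   # out-of-bounds neighbors get the maximal difference
    return d
```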
To speed up the process and take account of the inter-
dependence of distant pixels, a multi-resolution scheme is
applied. It makes use of a multigrid representation of the
output image, as shown in Fig. 9.
Figure 9. Multiscale grid scheme.
The full resolution is represented by the bottom grid, where
each intersection point ’o’ corresponds to a pixel of the
image. The lower resolutions are decimated versions of the
preceding level, i.e. pixels of grid level i are part of grid
level i−1. For convenience, pixels of decimated layers keep
their original coordinates as they had within the bottom
grid, so that the neighborhoods are scaled up by a factor
of 2 from level to level. Such a coordinate-doubling scheme
allows every output layer to work with the whole input image
directly, without splitting it into input layers: the candidate
neighborhoods can be placed anywhere; all that matters is
that the neighborhood scaling corresponds to the scale of
the current layer. This guarantees that the whole input
information is used smoothly at every layer, even if the
input and output image sizes differ.
The whole pyramid is initialized randomly according to
the histogram of the input image. The synthesis starts at the
coarsest grid and proceeds sequentially through the layers.
After synthesizing the first layer as described above, its pix-
els are copied to the next finer layer with a confidence equal
to 0.5 (all other pixels of this finer layer still have confi-
dence 0). The confidence update (1) remains the same. This
procedure is repeated for every layer.
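The coordinate-doubling scheme can be summarized by two small helpers: which pixels belong to a grid level and how the neighborhood offsets scale with that level. Again, this is an illustrative sketch, not the original code.

```python
# Sketch of the multigrid coordinate-doubling: every level lives on the
# full-resolution coordinate grid, only the stride (and hence the effective
# neighborhood size) doubles from level to level.
def layer_pixels(height, width, level):
    """Pixel coordinates belonging to grid level `level` (level 0 = full res)."""
    step = 2 ** level
    return [(i, j) for i in range(0, height, step) for j in range(0, width, step)]

def scaled_offsets(base_offsets, level):
    """Neighborhood offsets scaled to the stride of the given level."""
    step = 2 ** level
    return [(u * step, v * step) for (u, v) in base_offsets]
```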
3.2. Sequential Synthesis of Other Views
After having synthesized the support view (typically a
frontal view) using the algorithm of the previous section,
other views are synthesized in a similar way, but guided by
this support. Indeed, for every other view we modify the
synthesis algorithm in order to introduce the dependence
on the synthetic support view and hence, to achieve view
consistency.
Sticking to the names used in the preliminary dis-
cussion in section 2.3, the support view is A′ and the view
to be synthesized is B′. Not single intensities, but pairs are
considered (or pairs of RGB triples for colored textures).
For the input, an intensity (RGB triplet) of the frontal view
A and of the view B (for the same viewing and lighting di-
rections as B′) are taken from the same BTF column. At
the output, we combine the intensities (colors) at a pixel of
the synthetic support view A′ and at the corresponding pixel
of the view B′ to be synthesized, where only the latter are
allowed to be changed.
As intensities (RGB triplets) have to be synthesized only
for view B′, only neighboring pixels there are used for the
selection of input neighborhoods to compare against. But
this time the neighborhood comparisons also take the simi-
larities in the frontal views into account. The distance func-
tion is as follows:
\mathrm{dist}(i, j, p, q) = \sum_{(u, v) \in N} \big[ w(i+u, j+v) \cdot \| B'(i+u, j+v) - B(p+u, q+v) \|^2 + c \cdot (1 - w(i+u, j+v)) \cdot \| A'(i+u, j+v) - A(p+u, q+v) \|^2 \big],
where c \in [0, 1]. This constant is needed because at the
beginning of the synthesis of B′ the support view A′ has
roughly three times as strong an influence on the distance
and must be suppressed somewhat; c = 0.25 in all experi-
ments.
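For completeness, the modified distance for the non-support views can be sketched as below (our transcription, with border handling simply skipped here rather than specified). A′ and B′ denote the synthetic support view and the view under construction, A and B the corresponding original views.

```python
# Sketch of the view-consistent distance of Section 3.2: the B'-term is
# weighted by the evolving confidence w, the A'-term by c * (1 - w), c = 0.25.
import numpy as np

def consistent_distance(A, B, A_syn, B_syn, w, i, j, p, q, offsets, c=0.25):
    d = 0.0
    for (u, v) in offsets:
        oi, oj, pi, pj = i + u, j + v, p + u, q + v
        if not (0 <= oi < B_syn.shape[0] and 0 <= oj < B_syn.shape[1] and
                0 <= pi < B.shape[0] and 0 <= pj < B.shape[1]):
            continue   # border handling omitted in this sketch
        db = B_syn[oi, oj].astype(float) - B[pi, pj].astype(float)
        da = A_syn[oi, oj].astype(float) - A[pi, pj].astype(float)
        d += w[oi, oj] * float(np.sum(db * db))
        d += c * (1.0 - w[oi, oj]) * float(np.sum(da * da))
    return d
```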
4. Results
Textures were recorded by the test setup described in sec-
tion 2.1. For the experiments, the camera slant step was 5°
in the range of 0°-70°, and the light source slant angle was
fixed. The tilt angle changed through the rotation of the
example texture about the vertical axis in steps of 15° in
the range of 0°-90°. The fixed azimuth angle between cam-
era and illumination was about 135° (except for the middle
texture in Fig. 11, where the angle was about 30°), i.e. the
lamp rotated with the camera. This results in 98 images
per texture sample. For all examples given in this section,
alignment plane optimization was applied, as explained in
section 2.2.
The radius of the neighborhood in the algorithm was
chosen to be 3 or 4 pixels for the support views and 2 for
other views. The input image size was about 300 ×200 and
the synthetic textures have size 512 ×512.
In section 2 it was discussed that a good choice for the
alignment plane can help improve the results. Without op-
timizing the BTF representation for the interpolation task,
rendering might produce salient ghosts and blurring, as the
same surface patch drifts between views. This is illustrated
in Fig. 10. The image on the left is produced based on the
BTF obtained when the alignment plane is chosen to include
the markers of the turntable. As the texture was at a few
centimeters below this plane, the BTF suffers from rather
large drifts in the points on the surface that correspond to
the same image pixel. By picking an alignment plane that
minimizes these drifts (a plane lying within the height range
of the texture), a smoother BTF is obtained and much better
rendering results.
Figure 10. An intermediate view is interpolated from
its neighboring views. Left: the alignment plane is cho-
sen to coincide with the turntable marker plane (ghosts are
clearly visible); right: with the alignment plane at an opti-
mal height.
In Fig. 11 synthesis results are shown for our textures
‘M&Ms’, ‘white peas’, ‘ceiling’, ‘doormat’, and ‘foam’.
From the two oblique views (middle and right columns),
one can see that the algorithm preserves view consistency
for a variety of regular and stochastic textures. Original
texture images with the same viewing conditions as for the
oblique, synthetic views in the middle column are shown
on the left, to illustrate the synthesis quality. Note, however,
that the support views were not these but the substantially
different frontal views.
5. Conclusion
We have presented a method to synthesize textures for
different viewing conditions that are mutually consistent.
One of the goals is animation. As input we use a set of
images, taken under multiple, known viewing and lighting
directions.
The actual synthesis is based on a novel copy-and-paste
type of algorithm. In order to solve problems with the copy-
ing of complete BTF columns – which is the usual approach
– we have proposed several modifications. These already
start at the level of input data preparation, through the opti-
mization of the alignment plane for building the BTF stacks.
This step has the effect of reducing the variations in the BTF
data, which already has a beneficial effect on the subsets
that can be matched. A further change is the separate syn-
thesis of the different views, based on a support view. Such
synthesis is based on the selection of only appropriate pairs
of values from entire BTF columns. This increases the num-
ber of choices dramatically. As a net effect, novel BTF
columns that did not appear in the input BTF stack are
composed from such pairs.
We have shown examples demonstrating that there are
fewer and less pronounced seams and inconsistencies with
this algorithm than with a similar one based on copy-and-
pasting complete BTF columns. The examples included
stochastic, regular, and near-regular textures.
Acknowledgements: The authors gratefully acknowl-
edge support through Swiss National Fund (SNF) project
200021-103850/1 ASTRA (Analysis and Synthesis of
Texture-Related Appearance) and EU IST project ‘Cogni-
tive Vision Systems’ CogViSys.
References
[1] M. Ashikhmin. Synthesizing natural textures. Symposium on Inter-
active 3D Graphics, pages 217–226, 2001.
[2] K. Dana, B. Ginneken, S. Nayar, and J. Koenderink. Reflectance
and texture of real-world surfaces. ACM Transactions on Graphics,
18(1):1–34, 1999.
[3] J. Dong and M. Chantler. Capture and synthesis of 3d surface tex-
ture. Texture 2002 Workshop, in conjunction with ECCV 2002,
pages 41–45, 2002.
[4] A. Efros and W. Freeman. Image quilting for texture synthesis and
transfer. SIGGRAPH 2001, Computer Graphics Proceedings, pages
341–346, 2001.
[5] A. Efros and T. Leung. Texture synthesis by non-parametric sam-
pling. Proc. Int. Conf. Computer Vision (ICCV’99), 2:1033–1038,
1999.
[6] G. Gimel’farb. Image Textures and Gibbs Random Fields. Kluwer
Academic Publishers, Dordrecht, 1999.
[7] G. Gimel’farb and D. Zhou. Fast synthesis of large-size textures
using bunch sampling. Proc. Int. Conf. Image and Vision Computing
2002, pages 215–220, Nov. 2002.
[8] Y. Hel-Or, T. Malzbender, and D. Gelb. Synthesis and rendering
of 3d textures. Texture 2003 Workshop accomp. ICCV 2003, pages
53–58, 2003.
[9] A. Hertzmann, C. E. Jacobs, N. Oliver, B. Curless, and D. H.
Salesin. Image analogies. SIGGRAPH 2001, Computer Graphics
Proceedings, pages 327–340, 2001.
[10] M. Koudelka, S. Magda, P. Belhumeur, and D. Kriegman. Acquisi-
tion, compression, and synthesis of bidirectional texture functions.
Texture 2003 Workshop accomp. ICCV 2003, pages 59–64, 2003.
[11] T. Leung and J. Malik. Recognizing surfaces using three-
dimensional textons. Proc. Int. Conf. Computer Vision (ICCV’99),
pages 1010–1017, 1999.
[12] X. Liu, Y. Yu, and H. Shum. Synthesizing bidirectional texture func-
tions for real-world surfaces. Siggraph 2001, Computer Graphics
Proceedings, pages 97–106, 2001.
[13] R. Paget and I. Longstaff. Texture synthesis via noncausal non-
parametric multiscale Markov random field. IEEE Transactions on
Image Processing, 7(6), 1998.
[14] J. Portilla and E. Simoncelli. A parametric texture model based
on joint statistics of complex wavelet coefficients. International
Journal of Computer Vision, 40(1):49–70, 2000.
[15] X. Tong, J. Zhang, L. Liu, X. Wang, B. Guo, and H. Shum. Syn-
thesis of bidirectional texture functions on arbitrary surfaces. ACM
Transactions on Graphics, 21(3):665–672, 2002.
[16] L. Wei and M. Levoy. Fast texture synthesis using tree-structured
vector quantization. Siggraph 2000, Computer Graphics Proceed-
ings, pages 479–488, 2000.
[17] www.esat.kuleuven.ac.be/psi/visics/texture/.
[18] A. Zalesny and L. Van Gool. A compact model for viewpoint de-
pendent texture synthesis. SMILE 2000 Workshop, Lecture notes in
computer science, 2018:124–143, 2001.
[19] A. Zalesny and L. Van Gool. Multiview texture models. CVPR
2001, 1:615–622, 2001.
Figure 11. Consistent synthesis. Left: originals; middle/right: synthesized views for camera slant angles of 50°/25°, resp.