Shape from texture and integrability
D.A. Forsyth
Computer Science Division
U.C. Berkeley
Berkeley, CA 94720
daf@cs.berkeley.edu
Abstract
We describe a shape from texture method that constructs a maximum a posteriori estimate of surface coefficients using both the deformation of individual texture elements (as in local methods) and the overall distribution of elements (as in global methods). The method described applies to a much larger family of textures than any previous method, local or global. We demonstrate an analogy with shape from shading, and use this to produce a numerical method. Examples of reconstructions for synthetic images of surfaces are provided, and compared with ground truth. The method is defined for orthographic views, but can be generalised to perspective views simply.
Keywords: Shape from texture, texture, computer vision, surface fitting
There are surprisingly few methods for recovering a surface model from a projection of a texture field that is assumed to lie on that surface. Global methods attempt to recover an entire surface model, using assumptions about the distribution of texture elements. Appropriate assumptions are isotropy [14] (the disadvantage of this method is that there are relatively few natural isotropic textures) or homogeneity [1, 2]. Methods based around homogeneity assume that texels are the result of a homogeneous Poisson point process on a plane; the gradient of the density of the texel centers then yields the plane's parameters. However, deformation of individual texture elements is not accounted for.
Local methods recover some differential geometric parameters at a point on a surface (typically, normal and curvatures). This class of methods, which is due to Garding [6], has been successfully demonstrated for a variety of surfaces by Malik and Rosenholtz [10, 13]; a reformulation in terms of wavelets is due to Clerc [3]. The method has a crucial flaw: it is necessary either to know that texture element coordinate frames form a frame field that is locally parallel around the point in question, or to know the differential rotation of the frame field (see [7] for this point, which is emphasized by the choice of textures displayed in [13]; the assumption is known as texture stationarity). For example, if one were to use these methods to recover the curvature of a doughnut dipped in chocolate sprinkles, it would be necessary to ensure that the sprinkles were all parallel on the surface (or that the field of angles from sprinkle to sprinkle was known). As a result, the method can be demonstrated to work only on quite a small class of textured surfaces. A second, important, difficulty lies in the data recovered; these methods all make local estimates of normal and curvature. But curvature is a derivative of the normal; as a result, while one local estimate may be helpful, there is no reason to believe that a collection of local estimates will be consistent. This is a problem of integrability. Surface interpolation methods have largely fallen out of fashion in computer vision, due to the uncertainty regarding the semantic status of surface patches in regions where data is absent. Shape from texture is a problem where an interpolate has an unquestionably useful role: it expresses the fact that, because one has a prior belief that surfaces are relatively slowly changing, incomplete local measurements of the surface normal can constrain one another and lead to good global estimates of the normal at some points.
We describe a shape-from-texture method that exploits the deformation of individual texels, and spatial properties of the distribution of texels, to produce a global approximating surface. To do so, we introduce a new texture model, which is a generalisation of that of Blake and Marinos [2]; we show how this model leads to a problem rather like classical shape from shading; and then demonstrate a variety of reconstructions resulting from this formulation.
1 A Texture Model
We model a texture on a surface as an inhomogeneous, marked Poisson point process. A point process is some random procedure that results in points lying on a surface (exact definitions involve tedious measure theory [5]). An inhomogeneous Poisson process has the property that there is some measure M such that the expected number of points in a set U is given by E(#(U)) = M(U). We confine our attention to cases where this measure can be given by a density with respect to area, so that there is some function λ(x) of position on the surface x such that
\[ E(\#(U)) = \int_U \lambda(x)\, dA \]
(where dA is the area form on the surface). The most familiar case is a homogeneous Poisson process, where λ is a constant. We assume that we are working with a parametric family of intensities, λ(x;w), with parameter w. A marked point process is one where each point carries a mark, drawn randomly according to some mark density from an available collection (for example, points might be red or blue).
In our model, the marks are texture elements (texels or textons, as one prefers; e.g. [9, 8] for automatic methods of determining appropriate marks) and the orientation of those texture elements with respect to some surface coordinate system. We assume that the marks are drawn from some known, finite set of classes of Euclidean equivalent texels. Each mark is defined in its own coordinate system; the surface is textured by taking a mark, placing it on the tangent plane of the surface at the point being marked, translating the mark's origin to lie on the surface point being marked, and rotating randomly about the mark's origin (according to the mark distribution). We assume that these texture elements are sufficiently small that they will, in general, not overlap, and that they can be isolated. Furthermore, we assume that they are sufficiently small that they can be modelled as lying on a surface's tangent plane at a point.
We assume that we have an orthographic view of a compact smooth surface, and that the boundary in this view is known. In section 4, we indicate what is required to relax the assumption of an orthographic view. We assume the viewing direction is the z-axis. Now consider one class of texture element; each instance in the image of this class was obtained by a Euclidean transformation of the model texture element, followed by a foreshortening that in an appropriate coordinate system on the surface and in the image can be written as
\[ F_i = \begin{pmatrix} \cos\sigma_i & 0 \\ 0 & 1 \end{pmatrix} \]
where $\sigma_i$ is the angle between the surface normal at mark i and the z axis. The relevant coordinate system is the orthonormal coordinate system constructed around the slant and tilt vectors for the tangent plane at the mark. Between the i'th mark and the j'th mark, this coordinate system rotates. In particular, consider the transformation $T_{i \to j}$, taking the i'th mark in the image to the j'th mark in the image. Ignoring the translation component, this transformation will have the form
\[ T_{i \to j} = R_1 F_j R_2 F_i^{-1} R_3 \]
where each R represents a rotation and F is as above. In particular, $R_3$ rotates the image coordinate system to the slant-tilt coordinate system at i, $R_2$ accounts for the in-plane rotation of the texture element on the tangent plane, and $R_1$ rotates the new slant-tilt coordinate system to the image coordinate system. It is straightforward that $\det(T_{i \to j}) = \cos\sigma_j / \cos\sigma_i$. Notice that the advantage of this formulation is that no assumption regarding the relative rotation of the texture elements on the surface is required. To recover a surface, we must: find the different texture elements; compute estimates of these determinants; and fit an appropriate surface model to the resulting data set. We defer discussion of finding the texture elements and estimating determinants to section 3.1 and show next how a surface can be recovered from this data.
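The determinant identity for the transformation between two marks can be checked numerically. The following is a minimal sketch, assuming 2x2 matrices stored as nested lists; the helper names (`rot`, `foreshorten`, `transfer`) and the two slant angles are illustrative choices, not part of the method itself.

```python
import math
import random

def rot(a):
    """2x2 rotation matrix for angle a (radians)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s], [s, c]]

def foreshorten(sigma):
    """Foreshortening diag(cos(sigma), 1) in slant-tilt coordinates."""
    return [[math.cos(sigma), 0.0], [0.0, 1.0]]

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inverse(m):
    d = det(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transfer(sigma_i, sigma_j, a1, a2, a3):
    """R1 F_j R2 F_i^{-1} R3, with the translation component ignored."""
    t = matmul(inverse(foreshorten(sigma_i)), rot(a3))
    t = matmul(rot(a2), t)
    t = matmul(foreshorten(sigma_j), t)
    return matmul(rot(a1), t)

# Hypothetical slants for marks i and j; the rotations are drawn at random.
sigma_i, sigma_j = 0.3, 1.1
rng = random.Random(0)
dets = [det(transfer(sigma_i, sigma_j,
                     *(rng.uniform(0.0, 2.0 * math.pi) for _ in range(3))))
        for _ in range(5)]
expected = math.cos(sigma_j) / math.cos(sigma_i)
```

Every entry of `dets` agrees with `expected` up to rounding, whatever rotations are drawn, which is why the in-plane rotation of a texture element does not need to be known.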
2 Fitting a maximum a posteriori surface to the texture model
We have the following data: a family of surfaces $\mathcal{S}$; a set of N textons, which lie on a surface S which may or may not belong to $\mathcal{S}$; the image domain R to which this surface projects; the image of each of these textons in R; the outline of S (we assume this is the boundary of R, ∂R); for each pair of points, $\det(T_{i \to j})$; a parametric family of intensities λ(x;w), describing the different possibilities for the point process intensity; and the fact that S is smooth and compact. We wish to recover a representation of the visible component of S.
2.1 The analogy with shape from shading
To understand this problem further, assume that N tends to infinity, so that we have m(x, y) = cos σ(x)/cos σ(y), for any two distinct points x, y ∈ R. Since S is compact, there is at least one point $y_0$ in R such that $\cos\sigma(y_0) = 1$. We can identify this point (or these points), because $m(x, y_0)$ is never larger than m(x, y), for any x and y. Now $m(x, y_0) = \cos\sigma(x)$; this means that our problem is exactly analogous to classical shape from shading with the illumination direction along the viewing direction.
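The identification of the frontal point from the ratios can be sketched as follows. This is an illustrative, noise-free example: it assumes the sampled points include one with cosine exactly 1, and the function name `recover_cosines` and the sample cosines are hypothetical.

```python
def recover_cosines(m):
    """m[x][y] = cos(sigma_x) / cos(sigma_y). Returns (y0, cosines).

    The frontal point has the largest cosine, so for any fixed x it
    minimises m[x][y] over y; row 0 is used here.
    """
    n = len(m)
    y0 = min(range(n), key=lambda y: m[0][y])
    return y0, [m[x][y0] for x in range(n)]

# Hypothetical cosines; index 1 is the frontal point.
cosines = [0.5, 1.0, 0.8, 0.25]
m = [[ci / cj for cj in cosines] for ci in cosines]
y0, recovered = recover_cosines(m)
```

With measured ratios, no sampled point need be exactly frontal, so the recovered values are cosines only up to a common scale.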
Typically, the shape from shading literature approaches the reconstruction problem by representing S as a Monge patch (i.e. (x, y, f(x, y))), identifying the z component of the unit normal with $m(x, y_0)$, and transforming the resulting partial differential equation to obtain
\[ f_x^2 + f_y^2 = \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2 = \left(\frac{1}{m(x, y_0)}\right)^2 - 1 = g(x). \]
There is no room for a detailed review of this old and well studied problem.
Uniqueness: Oliensis shows that if a solution exists, it is unique [11]. Note that points where g = 0 are particularly important, because the gradient field has a zero at these points. This uniqueness result is proven by noting that (1) the characteristic strips grow from points where g = 0 (but in a fashion that depends on the index of the underlying zero of the gradient field, the sign of which cannot be determined from g); (2) all characteristic strips approach the boundary transversally; (3) this constraint uniquely determines the sign of the indices of the gradient field; (4) once the indices are known, the reconstruction is unique. This means that the outline has a curious role in the uniqueness proof; one does not need to supply an outline precisely to prove uniqueness, but one does need to supply a closed curve which is (a) guaranteed to enclose all points where g = 0 and (b) guaranteed to have the characteristics (or, equivalently, the gradient) cross the curve transversally at every point.
Existence: Existence is significantly more difficult; it
appears that for generic m such that (a) 0 ≤ m ≤ 1 for all
points in R and (b) m = 0 for points on the boundary of
R, no solution exists.
Failure of the analogy: Apart from the fact that we
wish to deal with small collections of texture elements,
there are two important reasons that it is unwise to simply
use a shape from shading method:
Estimation: shape from shading methods transform the variational criterion from $\int (\hat n_z - m)^2\, dA$ to the form given above, to obtain a simpler PDE (the square root in the unit normal is cleared by this transformation). However, the two criteria are not equivalent: they correspond to different noise models, and they lead to different minima, unless the value at the minimum is zero. Since we shall assume that m is subject to additive Gaussian noise, we cannot perform this transformation and still find a statistically meaningful solution.
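The difference between the two criteria can be seen in a one-parameter toy problem, which is only an illustration and not the variational problem itself. Assume a single unknown cosine c observed through several noisy measurements $m_k$; the function names and the sample values are hypothetical.

```python
import math

def minimise_direct(ms):
    """argmin over c of sum_k (c - m_k)^2, which is the sample mean."""
    return sum(ms) / len(ms)

def minimise_transformed(ms):
    """argmin over c of sum_k ((1/c^2 - 1) - (1/m_k^2 - 1))^2.

    This is least squares in u = 1/c^2, so the minimiser satisfies
    1/c^2 = mean(1/m_k^2).
    """
    mean_inv_sq = sum(1.0 / (m * m) for m in ms) / len(ms)
    return 1.0 / math.sqrt(mean_inv_sq)

noisy = [0.5, 0.7]   # two inconsistent measurements of one cosine
clean = [0.6, 0.6]   # consistent measurements: zero residual at the minimum
direct_noisy = minimise_direct(noisy)
transformed_noisy = minimise_transformed(noisy)
```

For the inconsistent measurements the transformed criterion is minimised at roughly 0.575 rather than 0.6, while for the consistent ones both criteria agree, matching the remark that the minima coincide only when the value at the minimum is zero.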
Boundary conditions: the use of a Monge patch is not innocuous. In particular, Oliensis' uniqueness result shows the importance of the boundary (uniqueness can be shown only given the position of the boundary). But a Monge patch is badly behaved at the boundary; the gradient of the function must be infinite there. Any variational method should consider only functions with infinite gradient at the boundary, which, by the smoothness term, will sharply restrict the behaviour of the function within the domain. However, it is technically difficult to construct a family of functions that is guaranteed to turn away from the view at the boundary points. Furthermore, the approximation of curvature by second derivatives must fail near the boundary, meaning that the smoothness term is difficult to work with precisely at the points where it is most needed.
2.2 A likelihood function
We assume that our measurements of $d_{ij} = \det(T_{i \to j})$ are subject to additive Gaussian noise, with constant standard deviation $\sigma_d$. While this is a somewhat dubious noise model, it is not outrageous and it leads to a tractable problem. Now we have a series of measurements $d = (d_{00}, \ldots, d_{ij}, \ldots)$ corrupted with zero mean, additive Gaussian noise. They are conditionally independent given the surface, and so yield a log-likelihood
\[ -\log P(d_{ij} \mid z) = \frac{1}{2\sigma_d^2} \sum_{ij} \left( d_{ij} - \frac{\hat n_z(x_i; z)}{\hat n_z(x_j; z)} \right)^2 + K \]
where K is some normalising constant. This likelihood incorporates all information obtainable from the point-to-point deformation of texture elements in viewing, in the absence of rigid constraints on element-to-element rotation.
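As a minimal sketch, the data term of this likelihood can be evaluated as below. It assumes the normal z-components at the texel locations are already available as a list and that the ratio is indexed as in the printed formula; the function name and the sample values are hypothetical, and the constant K is omitted.

```python
def ratio_neg_log_likelihood(d, nz, sigma_d):
    """(1 / (2 sigma_d^2)) * sum_ij (d[i][j] - nz[i] / nz[j])^2, without K."""
    total = 0.0
    for i, ni in enumerate(nz):
        for j, nj in enumerate(nz):
            total += (d[i][j] - ni / nj) ** 2
    return total / (2.0 * sigma_d ** 2)

# Hypothetical normal components and the noise-free ratios they imply.
nz_true = [0.9, 0.6, 0.3]
d = [[ni / nj for nj in nz_true] for ni in nz_true]

at_truth = ratio_neg_log_likelihood(d, nz_true, sigma_d=0.05)
at_scaled = ratio_neg_log_likelihood(d, [0.5 * v for v in nz_true], sigma_d=0.05)
at_wrong = ratio_neg_log_likelihood(d, [0.9, 0.6, 0.4], sigma_d=0.05)
```

The scaled normals score as well as the true ones, which shows that this term alone fixes the cosines only up to a common factor.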
There is a second source of information about the surface; we expect that texture elements are distributed around the surface in a particular manner, consistent with an inhomogeneous Poisson point process. This can be expressed by looking at the quadrat counts. In particular, we subdivide the parameter domain into a fixed collection of non-intersecting sets (called quadrats by analogy with the procedure for point processes on the plane [4]). Notice that the natural alternative (the K-statistic, which is essentially a histogram of the number of points within circles of increasing radius [12]) is very difficult to use for a curved surface, as it requires solving geodesic equations. The count of points in each quadrat is multinomial, given the number of points N, any particular choice of surface z and of intensity function λ(x;w). We write the i'th quadrat as $q_i$, and write
\[ p_i = \frac{\int_{q_i} \lambda(x; w)\, dA}{\sum_j \int_{q_j} \lambda(x; w)\, dA}. \]
In particular, we have
\[ P(\#(q_1) = n_1, \#(q_2) = n_2, \ldots \mid z, w, N) = \left( \frac{N!}{n_1!\, n_2! \ldots} \right) p_1^{n_1} p_2^{n_2} \ldots \]
These quadrat counts are a discrete and generalised version of the process intensity cue used by Blake and Marinos [2]. Now the quadrat counts, which measure the extent to which the process intensity implied by the reconstruction corresponds to the intensity chosen, are independent of the normal measurements given the surface and the intensity. This means that we have a log-likelihood
\[ \log P(\text{measurements} \mid \text{surface, intensity, no. of points}) = \log P(d_{ij} \mid z) + \log P(\#(q_1), \#(q_2), \ldots \mid z, w) + K \]
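The quadrat term can be sketched with the standard multinomial probability. The example below assumes the integral of the intensity over each quadrat has already been computed for a candidate surface; the function names and the numbers are hypothetical.

```python
import math

def quadrat_probabilities(intensity_integrals):
    """p_i = (integral of lambda over q_i) / (sum of the integrals)."""
    total = sum(intensity_integrals)
    return [v / total for v in intensity_integrals]

def multinomial_log_prob(counts, probs):
    """log of N! / (n_1! n_2! ...) * p_1^n_1 * p_2^n_2 * ..."""
    n = sum(counts)
    log_coeff = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    # Quadrats with zero count contribute a factor of one.
    return log_coeff + sum(c * math.log(p)
                           for c, p in zip(counts, probs) if c > 0)

counts = [1, 1, 2]
# Homogeneous intensity: the integrals are proportional to quadrat areas.
good = multinomial_log_prob(counts, quadrat_probabilities([1.0, 1.0, 2.0]))
bad = multinomial_log_prob(counts, quadrat_probabilities([2.0, 1.0, 1.0]))
```

A surface whose quadrat areas match the observed counts (`good`) scores higher than one that assigns the largest area to a sparsely populated quadrat (`bad`).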
2.3 Priors
Our prior will be zero except on a family of surfaces (which we describe in section 3.2). We require a prior to enforce smoothness on z. However, it is unwise to use the second derivative approximation to curvature, because this is very badly behaved at the boundary (where the surface is nearly vertical). Instead, we compute the norm of the shape operator and sum over the surface. We must now choose a surface to minimise the negative log posterior, given by
\[ \frac{1}{2\sigma_d^2} \sum_{ij} \left( d_{ij} - \frac{\hat n_z(x_i; z)}{\hat n_z(x_j; z)} \right)^2 + K + \frac{1}{2\sigma_k^2} \int_R (\kappa_1^2 + \kappa_2^2)\, dA - \log \left[ \left( \frac{N!}{n_1!\, n_2! \ldots} \right) p_1^{n_1} p_2^{n_2} \ldots \right] - \log \pi(w) \]
where $\kappa_1$ and $\kappa_2$ are the principal curvatures.
This posterior may, at first glance, look ambiguous; in particular, we have only ratios of cosines, not the cosines themselves. This ambiguity does not manifest itself in practice, because of the quadrat count term. Assume that we obtain cosines from the ratios by choosing the value of the maximum cosine (i.e. the angle between the most frontal texture element and the view direction). If the value of this choice is too low, the surface will (generally) bulge out toward the view; in turn, this means that the area represented by some quadrats will increase significantly, and the value of the quadrat term will decline sharply.
3 Implementation
3.1 Estimating Texture Foreshortening
Finding texture elements is an established technology (e.g. [8, 9]). For our demonstration of this reconstruction method, we assume that the locations and approximate scales of texture elements from each class are known. Now in this case, we need to estimate the relative foreshortening, $\det(T_{i \to j})$. One possibility is to construct an affine transformation from element i to element j that minimises (say) the sum of least-squares errors, perhaps using the iterative method of Malik and Rosenholtz [10], and take the determinant of that transformation. In fact, because we care only about the relative foreshortening, it is unnecessary to use this relatively elaborate (and expensive) method.
In particular, we note that for a function f, a region R and an affine transformation T,
\[ \int_{T(R)} f(T(x))\, dx = \frac{1}{|T|} \int_{T(R)} f(T(x))\, d(Tx). \]
Now if f falls to zero rather quickly, so that it is approximately zero close to the boundaries of both R and T(R), we have that
\[ \int_{T(R)} f(T(x))\, dx \approx \frac{1}{|T|} \int_{R} f(T(x))\, d(Tx). \]
This suggests integrating with respect to a weight function, which declines to zero quickly outside a small domain. We use this approximation, which can be shown to be successful over a satisfactory range of values. Note that this method requires that texture elements be separated, something that cannot be guaranteed with a Poisson model; because the texture elements are small, the bias resulting from approximating the hard-core model required with a Poisson model is tiny.
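The change of variables behind this estimate can be illustrated numerically. The sketch below is not the estimator used in the paper: it assumes a Gaussian window as the rapidly decaying function, a plain Riemann sum on a square grid, and a hypothetical transformation `T`, and it only checks that the ratio of the two integrals returns the determinant.

```python
import math

def grid_integral(g, half_width=8.0, step=0.05):
    """Riemann sum of g(x, y) over [-half_width, half_width]^2."""
    n = int(round(2 * half_width / step)) + 1
    total = 0.0
    for a in range(n):
        x = -half_width + a * step
        for b in range(n):
            y = -half_width + b * step
            total += g(x, y)
    return total * step * step

def bump(u, v):
    """Rapidly decaying window, effectively zero far from the origin."""
    return math.exp(-(u * u + v * v))

def estimate_abs_det(t):
    """|det T| as (integral of f) / (integral of f composed with T)."""
    plain = grid_integral(bump)
    warped = grid_integral(
        lambda x, y: bump(t[0][0] * x + t[0][1] * y,
                          t[1][0] * x + t[1][1] * y))
    return plain / warped

T = [[0.5, 0.2], [-0.1, 1.0]]   # hypothetical transformation, det = 0.52
estimate = estimate_abs_det(T)
```

Because the window is negligible at the edge of the integration square, truncating the domain barely affects either integral, which is the property the approximation above relies on.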
Note that it is difficult to estimate $\det(T_{i \to j})$ for either i or j at a substantial inclination to the frontal direction. The value will be either very large, or very small; and, since one of the texture elements involved will cover relatively few pixels, it will be particularly difficult to estimate. It is not always possible to tell when this estimation error is likely to occur, as there may be regions in the interior where the inclination is large, but this error is pretty much guaranteed to occur close to the rim. This suggests that texture elements close to the rim should be ignored. We describe the method for doing so below.
3.2 An appropriate family of surfaces
The success of this approach depends sharply on the use of an appropriate family of surfaces. In particular, it is important to ensure that any surface considered turns away from the eye at the boundary. This requirement substantially reduces the available ambiguities (e.g. see Oliensis' uniqueness proof [11]). It is, in practice, relatively straightforward to ensure that this criterion is met. We describe our method for a circular boundary; the extension to any Jordan curve is straightforward; an extension to any boundary is technically more tricky, but appears tractable. We will work with parametric surfaces. Given a circular boundary, it is natural to use polar coordinates. We write our surface as (p(ρ)cos θ, p(ρ)sin θ, z(ρ, θ)) and the surface's unit normal as $\hat n = (\hat n_x, \hat n_y, \hat n_z)$. The boundary occurs at ρ = 1. Now if p(0) = 0, p(1) = 1, dp/dρ(1) = 0, and p′(ρ) > 0 for 0 ≤ ρ < 1, we have that $\hat n_z(1, \theta) = 0$, independent of our choice of z or θ. This means that this surface has the curve (cos θ, sin θ) as a component of its outline.
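The behaviour of the normal at the boundary can be checked with a small sketch. It assumes one particular radial map, p(ρ) = sin(πρ/2), which satisfies the stated conditions but is not taken from the paper, and it takes the partial derivatives of z at the point of interest as plain numbers.

```python
import math

def p(rho):
    """Assumed radial map: p(0) = 0, p(1) = 1, p'(1) = 0, p' > 0 on [0, 1)."""
    return math.sin(0.5 * math.pi * rho)

def dp(rho):
    """Derivative of p with respect to rho."""
    return 0.5 * math.pi * math.cos(0.5 * math.pi * rho)

def unit_normal(rho, theta, z_rho, z_theta):
    """Unit normal of (p cos(theta), p sin(theta), z) at (rho, theta).

    z_rho and z_theta are the partial derivatives of z at that point.
    """
    xr = (dp(rho) * math.cos(theta), dp(rho) * math.sin(theta), z_rho)
    xt = (-p(rho) * math.sin(theta), p(rho) * math.cos(theta), z_theta)
    n = (xr[1] * xt[2] - xr[2] * xt[1],
         xr[2] * xt[0] - xr[0] * xt[2],
         xr[0] * xt[1] - xr[1] * xt[0])
    length = math.sqrt(sum(c * c for c in n))
    return tuple(c / length for c in n)

# z-component of the normal on the boundary rho = 1 for several angles.
rim_nz = [unit_normal(1.0, 2.0 * math.pi * k / 8, -1.0, 0.3)[2]
          for k in range(8)]
interior_nz = unit_normal(0.5, 0.0, -1.0, 0.0)[2]
```

The z-component of the cross product is p(ρ)p′(ρ), so it vanishes at ρ = 1 whatever derivatives of z are supplied, provided ∂z/∂ρ is non-zero there so that the normal is defined.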
By this construction we have eliminated a substantial number of ambiguities, because we are confining our attention to surfaces whose outline lies on ∂R, something one cannot do with a Monge patch representation. We now represent z as an element of a linear family, whose basis is given by a tensor product. In particular, we model
\[ z = \sum_{kl} a_{kl}\, \phi_k(\rho)\, \psi_l(\theta). \]
By requiring that $\partial\phi_k/\partial\rho(0) = 0$ for all k, we ensure a properly defined surface normal at ρ = 0. We must also require that ∂z/∂ρ does not change sign along the curve ρ = 1 (or else the surface will be singular at the points where this expression changes sign).
3.3 Numerical Methods
Efficiency: the original form of the posterior leads to an extremely slow method, because there are $N^2$ ratio terms for N texture elements in the surface orientation term. Most of these terms are not independent. Instead, we will choose $s_i$ to minimise
\[ \sum_{i,j} \left( d_{ij} - \frac{s_i}{s_j} \right)^2 \]
and then maximise
\[ -\frac{1}{2\sigma_d^2} \sum_{ij} \left( \frac{s_i}{s_j} - \frac{\hat n_z(x_i; z)}{\hat n_z(x_j; z)} \right)^2 + K - \frac{1}{2\sigma_k^2} \int_R (\kappa_1^2 + \kappa_2^2)\, dA + \log \left[ \left( \frac{N!}{n_1!\, n_2! \ldots} \right) p_1^{n_1} p_2^{n_2} \ldots \right] + \log \pi(w). \]
Figure 1: Reconstruction examples for two surfaces with known structure. The figures on the top are rendered images of textured surfaces. On the left, the texture elements are circles, so that rotation of a texture element is moot. On the right, the texture elements are squares, and are rotated randomly. No known local shape from texture method can deal with such an example, without being informed in advance of the rotation component. The figures in the center show views of the reconstructed surfaces. The original surfaces are unit radius hemispheres. The reconstructed surfaces are guaranteed to pass through the point (0, 0, 0) at the center of the domain. We can compare the reconstruction with ground truth by computing the distances between points on the reconstructed surface and the point (0, 0, 1) (which is the center of the original sphere). The figures on the bottom are histograms of the distances for the reconstructed surfaces offset by the mean, suggesting that they are strongly similar to the original surface (the distance error lies within a range of rather less than 10% of the radius).
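One simple way to obtain per-element scales from the pairwise ratios is sketched below. Note the substitution: instead of minimising the sum of squared differences between $d_{ij}$ and $s_i/s_j$ directly, this sketch solves the least-squares problem on the logarithms of the ratios, which has a closed form and agrees with the direct criterion when the data are noise-free. The function name and the sample values are hypothetical.

```python
import math

def fit_scales_log_domain(d):
    """Fit s so that d[i][j] ~ s[i] / s[j], by least squares on log d[i][j].

    With the gauge sum_i log s_i = 0, the solution is
    log s_i = (mean_j log d[i][j] - mean_j log d[j][i]) / 2.
    """
    n = len(d)
    logs = [[math.log(v) for v in row] for row in d]
    scales = []
    for i in range(n):
        row_mean = sum(logs[i][j] for j in range(n)) / n
        col_mean = sum(logs[j][i] for j in range(n)) / n
        scales.append(math.exp(0.5 * (row_mean - col_mean)))
    return scales

# Hypothetical noise-free ratios built from known cosines.
cosines = [0.9, 0.6, 0.3]
d = [[ci / cj for cj in cosines] for ci in cosines]
s = fit_scales_log_domain(d)
```

The fitted scales reproduce every ratio and are defined only up to a common factor, so N numbers replace the $N^2$ ratio terms in the orientation term.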
Excising the rim: we ignore all texture elements for which r > 0.95, so as to avoid very poor normal estimates. This means that we have no measurement of the geometry of the surface for a strip around the boundary. Now we can benefit from the nature of the involvement of the boundary in Oliensis' proof here (see section 2.1) if we can guarantee that (a) there are no frontal points of the surface in this strip and (b) that the characteristics cross the internal boundary of this strip transversally. The first is a restriction on the surfaces that can be reconstructed, but a reasonable one. The second is achieved by insisting that ∂z/∂ρ > 0 inside this strip.
Constraints: requiring that ∂z/∂ρ is positive for $\rho > \rho_0$ is a set of linear constraints on $a_{kl}$; we can evaluate $\partial\phi_k/\partial\rho$ and $\psi_l$ at a discrete set of points $(\rho_p, \theta_p)$ and use the linear constraints
\[ \sum_{kl} a_{kl}\, \frac{\partial\phi_k}{\partial\rho}(\rho_p)\, \psi_l(\theta_p) > 0, \quad \text{that is,} \quad Ca > 0, \]
and we need to minimise the negative log-posterior subject to these constraints.
Figure 3: Local minima exist for this method, and it must be started at random to avoid them. This figure shows a local minimum for the case of the sphere with square texture elements in Figure 1; the value of the negative log posterior is about 1.5 times the value for the correct reconstruction.
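The linear constraints on the coefficients can be assembled as a matrix C with one row per sample point. The sketch below assumes a specific basis that the text does not give: radial functions ρ^(k+2), whose derivatives vanish at ρ = 0, and a low-order Fourier basis in θ. The sample points in the rim strip and the value of `rho0` are also illustrative.

```python
import math

N_PHI, N_PSI = 4, 5   # basis sizes quoted in the text; the basis is assumed

def dphi(k, rho):
    """Derivative of the assumed radial basis phi_k(rho) = rho**(k + 2)."""
    return (k + 2) * rho ** (k + 1)

def psi(l, theta):
    """Assumed angular basis: 1, cos t, sin t, cos 2t, sin 2t."""
    if l == 0:
        return 1.0
    freq = (l + 1) // 2
    return math.cos(freq * theta) if l % 2 == 1 else math.sin(freq * theta)

def constraint_matrix(points):
    """Row p holds dphi_k(rho_p) * psi_l(theta_p), flattened over (k, l)."""
    return [[dphi(k, rho) * psi(l, theta)
             for k in range(N_PHI) for l in range(N_PSI)]
            for rho, theta in points]

def satisfies(c, a):
    """True when every entry of C a is strictly positive."""
    return all(sum(cij * aj for cij, aj in zip(row, a)) > 0.0 for row in c)

rho0 = 0.95   # hypothetical inner radius of the rim strip
points = [(rho0 + 0.05 * i / 4, 2.0 * math.pi * j / 16)
          for i in range(5) for j in range(16)]
C = constraint_matrix(points)

a_bowl = [0.0] * (N_PHI * N_PSI)
a_bowl[0] = 1.0   # z = rho**2, so dz/drho = 2 rho > 0 everywhere in the strip
```

A constrained optimiser would be handed `C` together with the objective; `satisfies` is only a feasibility check for a candidate start point.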
4 Discussion
Maximisation: the resulting problem is difficult. For a reasonable basis (four $\phi_k$ and five $\psi_l$, leading to a total of 20 elements), it can be handled by Matlab's constr function, if this is started at a feasible point. There are clearly a variety of local minima, some of which are illustrated below. These can be dealt with by running the method at a selection of randomly chosen feasible start points; in our experience, of the order of five starts usually leads to a satisfactory reconstruction. It is wise to include $\psi_l(\theta) = 1$ in the basis to allow for surfaces with rotational symmetry, and to constrain $\phi_k$ and $\psi_l$ to ensure that the point ρ = 0 has a properly defined surface normal.
Homogeneity: we have implemented this method for the restricted case of a homogeneous Poisson distribution.