On Growth and Formlets: Sparse Multi-Scale Coding of
Planar Shape
James H. Elder^{a,∗}, Timothy D. Oleskiw^{b}, Alex Yakubovich^{a}, Gabriel Peyré^{c}
^{a} Centre for Vision Research, York University, Toronto, Canada
^{b} Department of Applied Mathematics, University of Washington, Seattle, WA, United States
^{c} CEREMADE, Université Paris-Dauphine, Paris, France
Abstract
We propose a sparse representation of 2D planar shape through the composi-
tion of warping functions, termed formlets, localized in scale and space. Each
formlet subjects the 2D space in which the shape is embedded to a localized
isotropic radial deformation. By constraining these localized warping transfor-
mations to be diffeomorphisms, the topology of shape is preserved, and the set
of simple closed curves is closed under any sequence of these warpings. A gener-
ative model based on a composition of formlets applied to an embryonic shape,
e.g., an ellipse, has the advantage of synthesizing only those shapes that could
correspond to the boundaries of physical objects. To compute the set of formlets
that represent a given boundary, we demonstrate a greedy coarse-to-fine formlet
pursuit algorithm that serves as a non-commutative generalization of matching
pursuit for sparse approximations. We evaluate our method by pursuing par-
tially occluded shapes, comparing performance against a contour-based sparse
shape coding framework.
Keywords: planar shape, deformation, sparse coding, contour completion
1. Introduction
Shape information is important for a broad range of computer vision prob-
lems. For some detection and recognition tasks, discriminative models that use
non-invertible shape codes (e.g., [1]) can be effective. However, many other
tasks call for a more complete generative model of shape. Examples include:
(1) shape segmentation, recognition, and tracking in cluttered scenes, where
shapes must be distinguished not just from each other, but from ‘phantom’
shapes formed by conjunctions of features from multiple objects [2]; (2) modeling
of shape articulation, growth, and deformation; and (3) modeling of shape
similarity.
∗Corresponding author.
Email addresses: jelder@yorku.ca (James H. Elder), oleskiw@uw.edu (Timothy D.
Oleskiw), yakuboa@yorku.ca (Alex Yakubovich), gabriel.peyre@ceremade.dauphine.fr
(Gabriel Peyré)
URL: www.yorku.ca (James H. Elder),
http://depts.washington.edu/amath/people/Timothy.Oleskiw (Timothy D. Oleskiw),
elderlab.yorku.ca/~alex (Alex Yakubovich), http://www.ceremade.dauphine.fr/~peyre
(Gabriel Peyré)
Preprint submitted to Image and Vision Computing, January 4, 2013
Our paper concerns the generative modeling of natural 2D shapes in the
plane, represented by their 1D boundary. We restrict our attention to simply-
connected shapes whose boundaries are smooth, simple, and closed curves. We
seek a generative shape model that satisfies a set of properties that seem to us
essential:
1. Completeness. The model can produce all shapes.
2. Closure. The set of valid shapes is closed under the generative model. In
other words, the model generates only valid shapes.
3. Composition. Complex shapes are generated by combining simpler com-
ponents.
4. Sparsity. Good approximations of shape can be generated with relatively
few components.
5. Progression. Approximations can be improved by incorporating more
components.
6. Locality. Components are localized in space.
7. Scaling. Components are tuned to specific scales and are self-similar over
scale.
8. Region & Contour. Components can capture both region and contour
properties in a natural way.
The need for completeness is self-evident if the system is to be general.
Closure is critical if we hope to capture the statistics of natural shape in a set of
hidden generative variables. Without closure, heuristics must be used to avoid
the generation of invalid shapes, e.g., bounding contours with self-intersections.
Aside from the resulting inefficiency, this creates a discrepancy between the
statistical structure encoded by the model and the samples the model produces. In
other words, the model cannot fully capture the statistics of natural boundaries.
Composition (here we use the word in a general sense) is important if we are
to handle the richness and complexity of natural shapes while maintaining con-
ceptual simplicity. Given the high dimensionality of natural shapes, sparsity is
necessary in order to store shape models [3]. Sparsity also implies that essential
shape features have been made explicit [4]. Progression allows the complexity
of the model to be matched to the difficulty of the task, facilitating real-time
operation and coarse-to-fine optimization.
Locality is a natural goal, since a first-order property of natural images
is local coherence. Nearby points on the surface of an object tend to have
similar reflectance, attitude, and illumination. Locality also allows for greater
robustness to occlusion, since components are more likely to be either entirely
visible or removed altogether rather than distorted. Scaling allows invariance
over object size, and allows shape features of different sizes to be captured
separately.
Finally, it has long been recognized that planar shape description requires
attention to both region and contour properties [3]. Some shape properties,
e.g., curvature, are naturally described by the bounding contour. Others, e.g.,
necks, are best described as region properties, since they involve points that are
proximal in the image but distant along the contour. A good generative model
will allow both to be encoded in a natural way.
We begin by reviewing prior models, with an eye to each of these essential
properties.
2. Prior Work
Early models that used chain coding or splines to encode shapes were not
generative and failed to succinctly capture global properties of shape. Fourier
descriptor, moment, and PCA bases have the potential to be generative, but
since all components are global, they are not robust to occlusion or local defor-
mation [5, 3, 6]. For these reasons, most modern approaches attempt to capture
structure at intermediate scales, or over a range of scales. Most of these models
can be crudely partitioned into two classes: contour-based and symmetry-based.
2.1. Contour-Based Models
Attneave [4] pointed to the concentration of information in the curvature of
the bounding contour, and suggested the potential for sparse descriptions based
on points of extremal curvature magnitude. Hoffman & Richards [7] linked cur-
vature to the part structure of shapes, proposing that parts are perceptually
segmented at negative minima of curvature. Mokhtarian & colleagues empha-
sized the encoding of curvature inflections across scale space for the purpose of
shape recognition [8].
While none of these early models are generative, Dubinskiy & Zhu [9] have
more recently proposed a contour-based shape representation that is both gen-
erative and sparse. The theory is based upon the representation of a shape by
a summation of component shapelets. A shapelet is a primitive curve defined
by Gabor-like coordinate functions that map arclength to the image, which can
be represented by the complex plane.
Specifically, a shapelet γ(t; σ, µ) is a mapping of arc length t ∈ [0, 1] to the
image, represented by the complex plane. Each shapelet is parameterized by
an arc length position parameter µ and a scale parameter σ, and has the specific
form:
$$\gamma(t;\sigma,\mu) = \exp\left(-\frac{(t-\mu)^2}{2\sigma^2}\right)\left[\cos\left(\frac{2\pi}{\sigma}(t-\mu)\right) + i\,\sin\left(\frac{2\pi}{\sigma}(t-\mu)\right)\right]. \tag{1}$$
Figure 1 shows the coordinate functions and trace of an example shapelet. Note
that the planar curves generated by γ(t; σ, µ) are identical on t ∈ R up to a
linear reparameterization, i.e., they are self-similar. However, these functions
are only approximately self-similar on any finite domain over which a curve will
be defined. Also, note that γ does not in general generate a simple closed curve.
In fact, as σ → 0, the number of sinusoidal periods on the interval t ∈ [0, 1]
explodes, generating an infinite number of self-intersections.
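As a minimal sketch of Equation 1 (not code from Dubinskiy & Zhu), a shapelet can be evaluated with Python's built-in complex numbers. The names `shapelet` and `periods_on_unit_interval` and the sample parameter values are illustrative assumptions.

```python
import cmath
import math

def shapelet(t, sigma, mu):
    """Evaluate the shapelet of Equation 1 at arc length t (complex result)."""
    envelope = math.exp(-((t - mu) ** 2) / (2.0 * sigma ** 2))
    phase = 2.0 * math.pi * (t - mu) / sigma
    # cmath.exp(1j * phase) equals cos(phase) + i sin(phase)
    return envelope * cmath.exp(1j * phase)

def periods_on_unit_interval(sigma):
    """Number of sinusoidal periods on t in [0, 1]; grows without bound as sigma -> 0."""
    return 1.0 / sigma

# Sample one shapelet at 128 arc-length positions (illustrative parameter values).
samples = [shapelet(k / 127.0, sigma=0.2, mu=0.5) for k in range(128)]
```

The envelope peaks at t = µ, where the shapelet takes the value 1, and the period count 1/σ makes the growth in self-intersections for small σ explicit.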
Figure 1: An example shapelet. (a) Component functions x(t) and y(t), plotted against arc length t. (b) Generated image trace, y(t) against x(t).
Shifting and scaling shapelets over arclength produces a basis set sufficient
to generate arbitrarily complex shapes. In particular, a K-shapelet curve Γ^K(t)
can be defined as:
$$\Gamma^K(t) = z_0 + \sum_{k=1}^{K} A_k\,\gamma(t;\sigma_k,\mu_k), \tag{2}$$
where the 2 × 2 matrix A_k applies an affine transformation to each shapelet in
image space prior to linear combination.
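The summation in Equation 2 can be sketched as follows, treating each complex value as the vector (Re z, Im z) when the 2 × 2 matrix is applied. The helper names `apply_matrix` and `shapelet_curve`, and the tuple layout of `components`, are assumptions made for illustration.

```python
import cmath
import math

def shapelet(t, sigma, mu):
    """Shapelet of Equation 1."""
    envelope = math.exp(-((t - mu) ** 2) / (2.0 * sigma ** 2))
    return envelope * cmath.exp(2j * math.pi * (t - mu) / sigma)

def apply_matrix(A, z):
    """Apply a real 2x2 matrix A = ((a, b), (c, d)) to z viewed as (Re z, Im z)."""
    (a, b), (c, d) = A
    return complex(a * z.real + b * z.imag, c * z.real + d * z.imag)

def shapelet_curve(t, z0, components):
    """Equation 2: z0 plus the sum of transformed shapelets.

    components is a list of (A_k, sigma_k, mu_k) tuples (hypothetical layout).
    """
    return z0 + sum(apply_matrix(A, shapelet(t, s, m)) for A, s, m in components)
```

Because the components are simply added, nothing in this construction prevents the resulting curve from crossing itself.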
Dubinskiy & Zhu’s shapelet model has many positive features. Components
are localized, albeit only in arclength, and scale is made explicit in a natural way.
However, like all contour-based methods, the shapelet theory does not explicitly
capture regional properties of shape. Perhaps most crucially, the model does
not respect the topology of object boundaries: sampling from the model will in
general yield non-simple, i.e., self-intersecting, curves (Figure 2). This violates
the closure criterion identified in Section 1.
Figure 2: Sampling from the shapelet model generally yields non-simple curves.
2.2. Symmetry-Based Models
Blum and colleagues [10, 11] introduced the symmetry axis representation
of shape in which a planar shape is represented by a 1D skeleton function and
associated 1D radius function. The symmetry axis representation led to re-
lated representations [12] which found application in medical imaging and other
domains.
Subsequent work incorporated notions of scale and time with symmetry axis
descriptions. Leyton [13] related symmetry axis descriptions to causal defor-
mation processes acting upon prototype shapes. In this view, symmetry axes,
terminating at curvature extrema on the boundary, are understood as records
of these deformation processes. Subsequent work on curve evolution methods
and shock-graph representations [14, 15] has provided a more complete theory
of region-based shape representation that has been broadly applied.
Despite the many appealing features of symmetry axis and shock-graph rep-
resentations, these methods, in general, are not sparse. In fact, the description of
each shape typically requires more storage, and little emphasis has been placed
on making symmetry axis representations generative [3]. Recent work of Trinh
and Kimia exploring generative and sparse models based upon shock graphs
comes some way in overcoming these limitations [16]. However, the constraints
required to enforce the closure property, i.e., topological constraints, are fairly
complex, and the full potential of the theory has yet to be explored.
A related approach to shape representation (e.g., [17, 18]) employs finite
element modelling techniques to code the bounding contour in terms of the
free vibration modes of the shape, which are said to correspond to the object’s
generalized axes of symmetry. The main difficulty in developing this approach
into a generative model is that points on the boundary are coupled only locally
in the intrinsic coordinates of the shape boundary, thus nothing constrains the
topology of generated shapes.
2.3. Hybrid Approaches
Recognizing the merits and limitations of both contour-based and symmetry-
based approaches, Zhu [19] developed an MRF model for natural 2D shape, em-
ploying a neighbourhood structure that can directly encode both contour-based
and region-based Gestalt principles. The theory is promising in many respects.
It is generative, providing an explicit probabilistic model, and it captures both
region and contour properties. It is not sparse, however, and because the un-
derlying graph is lifted from the image plane, there is nothing in the model
that encodes the topological constraint that the boundary be simple, i.e., non-
intersecting. Instead, when sampling from the model, a ‘firewall’ is employed
to prevent intersections. Again, this is inefficient, and it also creates a dis-
connect between the generative variables encoding the model and the sampling
distribution.
2.4. Coordinate Transformations
A different class of model that could also be called region-based involves the
application of coordinate transformations of the planar space in which a shape
is embedded. This idea can be traced back at least to D’Arcy Thompson, who
considered specific classes of global coordinate transformations of the plane to
model the relationship between the shapes of different animal species [20]. In
the field of computer vision, Jain et al. [21] were among the first to extend this
idea to more general deformations with a complete Fourier deformation basis
that they used to match observed shapes to stored prototypes. However, this
Fourier basis fails to satisfy the locality property, and as a potential genera-
tive model it does not satisfy the closure property: random combinations of
Fourier deformation components will not in general preserve the topology of the
prototype curve.
More recently, Sharon & Mumford [22] have explored conformal mappings as
global coordinate transformations between planar shapes. However, although
the Riemann mapping theorem guarantees that any simple closed curve can
be conformally mapped to the unit circle, conformal mappings do not in gen-
eral preserve the topology of embedded contours. Hence, despite the compu-
tational constraints imposed by the Cauchy-Riemann equations, we again have
the problem that the set of valid bounding contours is not closed under these
transformations, making generative modeling difficult.
2.5. Localized Diffeomorphisms: Formlets
In considering prior generative shape models, the goal that seems most elu-
sive is that of closure: ensuring that the model generates only valid shapes. Our
approach originates with the observation that, while general smooth coordinate
transformations of the plane will not preserve the topology of an embedded
curve, it is straightforward to design a specific family of diffeomorphic transfor-
mations that will. It then follows immediately by induction that a generative
model based upon arbitrary sequences of diffeomorphisms will satisfy the closure
property.
In this paper we specifically consider a family of diffeomorphisms we call
formlets. A formlet is a simple, isotropic, radial deformation of planar space
that is localized within a specified circular region of a selected point in the plane.
The family comprises formlets over all locations and spatial scales. While the
gain of the deformation is also a free parameter, it is constrained to satisfy a
simple criterion that guarantees that the formlet is a diffeomorphism. Since
topological changes in an embedded figure can only occur if the deformation
mapping is either discontinuous or non-injective, these diffeomorphic deforma-
tions are guaranteed to preserve the topology of embedded figures. Thus the
model satisfies the closure property.
By construction, formlets satisfy the desired locality and scaling proper-
ties. It is straightforward to show that the model also satisfies the composition,
completeness, and progression properties in that an arbitrary shape can be ap-
proximated to increasing precision by composing an appropriate sequence of
localized formlets. Since each formlet may be centered either near the contour,
near a symmetry axis, or at any other location in the plane, the model has the
potential to capture both region and contour properties directly.
Our formlet model is closely related to recent work by Grenander et al.
[23], modeling changes to anatomical parts over time. Their representation,
called Growth by Random Iterated Diffeomorphisms (GRID), models growth
as a sequence of local and radial deformations. They demonstrate their model
by tracking growth in the rat brain, as revealed in sequential planar sections of
MRI data.
In the present paper we explore the possibility that these ideas could be
extended to model not just differential growth between sequential shapes, but
to serve as the basis for a generative model over the entire space of smooth
shapes, based upon a universal embryonic shape in the plane such as an ellipse.
Elements of the present paper were first reported at CVPR [24]. The main
contributions of this conference paper were:
1. We illustrated the completeness and closure properties of the formlet
model through random generation of sample shapes.
2. To solve the inverse problem of modeling given shapes, we developed and
applied a generalization of matching pursuit, which selects the sequence of
formlets that minimizes approximation error. We demonstrated that this
formlet pursuit algorithm allows for progressive approximation of shape,
while preserving topological properties.
3. We assessed the robustness of the formlet model to occlusion by evaluat-
ing it on the problem of contour completion. We found that the model
compares favourably with the contour-based shapelet model [9] on this
important problem.
In the present paper we elaborate substantially on these contributions, in-
cluding full derivations and complete implementation details. But we also build
on this work with several important new contributions:
1. We introduce a method for handling analytically computed optimal gain
values that exceed the diffeomorphism bounds.
2. We develop and evaluate an improved parameter optimization method
called dictionary descent, and show that it increases accuracy by 11% and
decreases run time by 42%, relative to standard dictionary pursuit.
3. We provide derivations for the Jacobian required for this new dictionary
descent method.
4. We develop, evaluate and compare several alternative mathematical for-
mulations of the formlet function.
5. We report statistics of formlet model parameters for our database of ani-
mal shapes, demonstrating coarse-to-fine scaling properties and an inter-
esting anisotropy in the location distribution.
3. Formlet Coding
3.1. Formlet Bases
We represent the image in the complex plane C, and define a formlet f : C → C
to be a diffeomorphism of the complex plane localized in scale and space. Such
a deformation can be realized by centering f about the point ζ ∈ C and allowing
f to deform the plane within a (σ ∈ R+)-region of ζ. Our Gabor-inspired
deformation is defined as
$$f(z;\zeta,\sigma,\alpha) = \zeta + \frac{z-\zeta}{|z-\zeta|}\,\rho(|z-\zeta|;\sigma,\alpha), \quad\text{where}\quad \rho(r;\sigma,\alpha) = r + \alpha\,\sin\left(\frac{2\pi r}{\sigma}\right)\exp\left(-\frac{r^2}{\sigma^2}\right). \tag{3}$$
Thus each formlet f : C → C is a localized isotropic and radial deformation
of the plane at location ζ and scale σ. The magnitude of the deformation
is controlled by the gain parameter α ∈ R. Figure 3 demonstrates formlet
deformations of the plane with positive and negative gain.
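A direct transcription of Equation 3 into Python might look like the sketch below. The function names are illustrative, and returning ζ unchanged when z = ζ is an assumption made here to avoid dividing by zero (it is consistent with ρ(0) = 0).

```python
import math

def rho(r, sigma, alpha):
    """Radial deformation of Equation 3."""
    return r + alpha * math.sin(2.0 * math.pi * r / sigma) * math.exp(-(r / sigma) ** 2)

def formlet(z, zeta, sigma, alpha):
    """Apply the formlet centred at zeta, with scale sigma and gain alpha, to the point z."""
    d = z - zeta
    r = abs(d)
    if r == 0.0:
        return zeta  # the centre is a fixed point, since rho(0) = 0
    return zeta + (d / r) * rho(r, sigma, alpha)
```

Points far from ζ are left essentially untouched because of the Gaussian envelope, while nearby points are pushed outward (α > 0) or pulled inward (α < 0) along rays through ζ.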
Figure 3: Example formlet deformations: (a) expansion (α > 0); (b) compression (α < 0). The location of the formlet ζ is indicated by the
asterisk.
3.2. Diffeomorphism Constraint
Without any constraints on the parameters, these deformations, though continuous,
can fold the plane on itself, changing the topology of an embedded contour.
In order to preserve topology, we must constrain the gain parameter to
guarantee that each deformation is a diffeomorphism. As the formlets defined
in Equation 3 are both isotropic and angle preserving, it is sufficient to require
that the radial deformation ρ be a diffeomorphism of R+, i.e., that ρ(r; σ, α) be
strictly increasing in r:
$$\begin{aligned}
&\frac{\partial}{\partial r}\,\rho(r;\sigma,\alpha) > 0 \\
\Rightarrow\ &\alpha\,\frac{\partial}{\partial r}\left[\sin\left(\frac{2\pi r}{\sigma}\right)\exp\left(-\frac{r^2}{\sigma^2}\right)\right] > -1 \\
\Rightarrow\ &\frac{2\alpha}{\sigma}\exp\left(-\frac{r^2}{\sigma^2}\right)\left[\pi\cos\left(\frac{2\pi r}{\sigma}\right) - \frac{r}{\sigma}\sin\left(\frac{2\pi r}{\sigma}\right)\right] > -1
\end{aligned} \tag{4}$$
For α < 0, it is easy to see that the minimal slope of ρ is attained as r → 0+.
Evaluating Equation 4 at r = 0 thus yields the lower bound on the gain α:
$$\alpha > -\frac{\sigma}{2\pi}. \tag{5}$$
For positive α, the location of the minimum in ρ′(r) does not have a closed
form solution, but can be computed numerically:
$$\alpha \lesssim 0.1956\,\sigma. \tag{6}$$
Thus the diffeomorphism constraint is:
$$\alpha \in \sigma\left(-\frac{1}{2\pi},\ 0.1956\right). \tag{7}$$
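The numerical bound can be reproduced with a simple grid scan. The sketch below writes the last line of Equation 4 as 1 + (2α/σ) h(u) > 0 with u = r/σ and searches for the extrema of h; the names `slope_factor` and `gain_bounds`, the scan range, and the grid resolution are assumptions, not the authors' procedure.

```python
import math

def slope_factor(u):
    """h(u) with u = r / sigma, so that d(rho)/dr = 1 + (2 * alpha / sigma) * h(u)."""
    return math.exp(-u * u) * (
        math.pi * math.cos(2.0 * math.pi * u) - u * math.sin(2.0 * math.pi * u)
    )

def gain_bounds(sigma, u_max=3.0, steps=30001):
    """Approximate the open interval of gains for which rho is strictly increasing."""
    values = [slope_factor(u_max * k / (steps - 1)) for k in range(steps)]
    h_min, h_max = min(values), max(values)
    lower = -sigma / (2.0 * h_max)  # negative gains fail first where h is largest (u = 0)
    upper = -sigma / (2.0 * h_min)  # positive gains fail first where h is most negative
    return lower, upper

lower, upper = gain_bounds(1.0)
```

Under these assumptions the scan gives a lower bound of −σ/(2π) and an upper bound close to 0.1956σ, in agreement with Equations 5 and 6.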
By enforcing this constraint, we guarantee that the formlet f(z; ζ, σ, α) is a
diffeomorphism of the plane. Hence, such a formlet acting on a curve embedded
in the plane will be a homeomorphism. In particular, let Γ be the continuous
mapping
$$\Gamma : [0,1] \to \mathbb{C}. \tag{8}$$
Recall that Γ is closed if Γ(0) = Γ(1), and simple if the mapping is otherwise
injective. Since a formlet f satisfying Equation 7 is bicontinuous,
if Γ is simple and closed, the deformed curve
$$\Gamma^{f}(t) = f(\Gamma(t)) \tag{9}$$
will also be simple and closed.
Figures 4(a) and (b) show the radial deformation function ρ(r; σ, α) as a
function of r for a range of gain α and scale σ values respectively. Figures 4(c)
and (d) show the corresponding trace of the formlet deformation of an ellipse
in the plane.
Figure 4: Formlet transformations as a function of scale and gain: (a) ρ with gain variation; (b) ρ with scale variation; (c) f with gain variation; (d) f with scale variation. Dashed lines denote invalid
formlet parameters outside the diffeomorphism bounds of Equation 7.
3.3. Formlet Composition
The power of formlets is that they can be composed to produce complex
shapes while preserving topology. We define the forward formlet composition
problem as follows. Given an embryonic shape Γ^0(t) and a sequence of K
formlets {f_1 ... f_K} drawn from a formlet dictionary D, determine the resulting
deformed shape Γ^K(t). The problem is well-posed because the set of simple
closed curves is closed under formlet deformation: multiple formlets can be
composed to generate complex shape transformations. Thus,
$$\Gamma^K(t) = (f_K \circ f_{K-1} \circ \cdots \circ f_1)(\Gamma^0(t)). \tag{10}$$
Figure 5 shows an example of forward composition from a circular embryonic
shape, where the formlet parameters ζ, σ, and α have been randomly selected.
Note that a rich set of complex shapes is generated without leaving the space
of valid shapes (simple, closed contours).
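Forward composition can be sketched in a few lines. The source does not state how the random parameters were drawn, so the uniform sampling ranges, the 0.95 safety factor on the gain bounds, and the names `random_formlet` and `compose` below are all assumptions.

```python
import math
import random

UPPER = 0.1956                    # upper gain bound per unit scale (Equation 6)
LOWER = -1.0 / (2.0 * math.pi)    # lower gain bound per unit scale (Equation 5)

def formlet(z, zeta, sigma, alpha):
    """Equation 3 applied to a single point."""
    d = z - zeta
    r = abs(d)
    if r == 0.0:
        return z
    rho = r + alpha * math.sin(2.0 * math.pi * r / sigma) * math.exp(-(r / sigma) ** 2)
    return zeta + (d / r) * rho

def random_formlet(rng):
    """Draw (zeta, sigma, alpha) with the gain strictly inside the diffeomorphism bounds."""
    zeta = complex(rng.uniform(-1.5, 1.5), rng.uniform(-1.5, 1.5))
    sigma = rng.uniform(0.1, 1.0)
    alpha = sigma * rng.uniform(0.95 * LOWER, 0.95 * UPPER)
    return zeta, sigma, alpha

def compose(curve, formlets):
    """Equation 10: apply f_1 first, then f_2, and so on."""
    for zeta, sigma, alpha in formlets:
        curve = [formlet(z, zeta, sigma, alpha) for z in curve]
    return curve

rng = random.Random(0)
circle = [complex(math.cos(2 * math.pi * k / 128), math.sin(2 * math.pi * k / 128))
          for k in range(128)]
shape = compose(circle, [random_formlet(rng) for _ in range(5)])
```

The order of application matters: composition is not commutative, so shuffling the list of formlets generally yields a different shape.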
Figure 5: Shapes generated by random formlet composition over the unit circle. The first
two rows show the result of applying 5 successive random formlets. The asterisk and circle
indicate formlet location ζ and scale σ, respectively. The bottom row shows some example
shapes produced from the composition of many random formlets.
A more difficult but interesting problem is inverse formlet composition: given
an observed shape Γ^obs(t), determine a sequence of K formlets {f_1 ... f_K}, drawn
from a formlet dictionary D, that best approximate Γ^obs(t) by minimizing some
reconstruction error ξ. Here we measure error as the L2 norm of the residual:¹
$$\xi(\Gamma^{\mathrm{obs}},\Gamma^K) = \left\|\Gamma^{\mathrm{obs}}(t)-\Gamma^K(t)\right\|_2^2 = \int_0^1 \overline{\left(\Gamma^{\mathrm{obs}}(t)-\Gamma^K(t)\right)}\,\left(\Gamma^{\mathrm{obs}}(t)-\Gamma^K(t)\right)dt. \tag{11}$$
¹ For notational simplicity, we treat contours as continuous functions of arc length t. In
practice, we represent contours as 128-point vectors. All integrals map to summations in a
straightforward manner.
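For sampled contours, the integral in Equation 11 becomes a sum over corresponding points. The sketch below assumes a uniform arc-length sampling so that the integral over [0, 1] maps to a mean; the name `l2_error` and that normalization are assumptions.

```python
def l2_error(target, model):
    """Discretized Equation 11 for two equally long lists of complex points.

    conj(d) * d equals |d|^2, and the integral over t in [0, 1] is
    approximated by the mean over uniformly spaced samples.
    """
    if len(target) != len(model):
        raise ValueError("contours must share a 1:1 point correspondence")
    return sum(abs(a - b) ** 2 for a, b in zip(target, model)) / len(target)
```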
4. Formlet Pursuit
4.1. Dictionary Method
As a first attempt to estimate the optimal formlet sequence {f_1 ... f_K}, we
propose a version of matching pursuit for sparse approximation [25], replacing
the linear summation of elements by a non-commutative composition of formlet
components. Algorithm 1 shows the flow of the formlet pursuit algorithm.
Algorithm 1: Formlet Pursuit of Γ^obs.
  Initialization: define Γ^0 = AΓ^0 + z_0 to be a best matching ellipse
    approximating Γ^obs
  for k = 1, ..., K do
    Optimal Formlet: compute maximal error reducing transformation
      f_k = argmin_{f ∈ D} ξ(Γ^obs, f(Γ^{k−1}))
    Update Approximation: apply optimal formlet
      Γ^k = f_k(Γ^{k−1})
  end for
Initialization. Given an observed target shape Γ^obs, we initialize the model as
a 128-point polygon sampling the unit circle, and form a 1:1 correspondence
between the model and target points that remains fixed throughout pursuit.
We next apply an affine transformation to the model to generate an embryonic
elliptical shape Γ^0 minimizing the L2 error ξ(Γ^obs, f(Γ^0)).
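One way to realize this initialization, assuming the model points sample the unit circle uniformly and the correspondence is by index, is sketched below. An affine image of the unit circle has the form z0 + a cos θ + b sin θ with complex a and b, and for uniform samples the functions 1, cos θ, and sin θ are mutually orthogonal, so the least-squares coefficients reduce to simple averages. The name `fit_ellipse` is illustrative.

```python
import math

def fit_ellipse(target):
    """Least-squares affine image of the uniformly sampled unit circle.

    Returns the fitted ellipse points and the coefficients (z0, a, b) of
    z0 + a*cos(theta) + b*sin(theta). Requires at least 3 target points.
    """
    n = len(target)
    angles = [2.0 * math.pi * k / n for k in range(n)]
    z0 = sum(target) / n
    a = 2.0 / n * sum(z * math.cos(th) for z, th in zip(target, angles))
    b = 2.0 / n * sum(z * math.sin(th) for z, th in zip(target, angles))
    ellipse = [z0 + a * math.cos(th) + b * math.sin(th) for th in angles]
    return ellipse, (z0, a, b)
```

The two complex coefficients a and b are the columns of the 2 × 2 affine matrix written in complex form, and z0 is the translation.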
Formlet Selection. At iteration k of the formlet pursuit algorithm, we select
the formlet f_k(z; ζ_k, σ_k, α_k) that, when applied to the current model Γ^{k−1},
maximally reduces the approximation error:
$$f_k = \underset{f \in D}{\operatorname{argmin}}\ \xi\left(\Gamma^{\mathrm{obs}}, f(\Gamma^{k-1})\right). \tag{12}$$
This is a difficult non-convex optimization problem, and experimentation
with gradient descent methods has shown that the formlet parameter space can
have many local minima. One saving grace is that the formlet transformation
is linear with respect to the gain α, allowing α to be recovered analytically.
Specifically, consider an alternative but equivalent representation of the formlet
described by Equation 3:
$$f(z;\zeta,\sigma,\alpha) = z + \alpha\, g(z-\zeta;\sigma), \quad\text{where}\quad g(z_\zeta;\sigma) = \frac{z_\zeta}{|z_\zeta|}\sin\left(\frac{2\pi|z_\zeta|}{\sigma}\right)\exp\left(-\frac{|z_\zeta|^2}{\sigma^2}\right). \tag{13}$$
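In Appendix A we show that, if we fix both the formlet location
ζ and scale σ, the optimal unconstrained gain α∗ for formlet f_k is given by
$$\alpha^{*} = \frac{\left\langle \Gamma^{\mathrm{obs}} - \Gamma^{k-1},\ g\left(\Gamma^{k-1} - \zeta;\sigma\right)\right\rangle}{\left\| g\left(\Gamma^{k-1} - \zeta;\sigma\right)\right\|_2^2}, \tag{14}$$
where ⟨·,·⟩ is the inner product on functions f : [0, 1] → C given in Equation
A.3.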
One complication is that Equation 14 may yield a gain value α∗ that does
not satisfy the diffeomorphism constraint given by Equation 7. However, from
Equations 3 and 11 it can be seen that the error is a quadratic function of the
gain α. Thus the optimal constrained gain α∗_c for given ζ and σ parameters is
simply the optimal unconstrained gain α∗ expressed by Equation 14, thresholded
by the diffeomorphism constraints:
$$\alpha^{*}_{c} = \begin{cases} \alpha_l & \text{for } \alpha^{*} < \alpha_l \\ \alpha^{*} & \text{for } \alpha_l \le \alpha^{*} \le \alpha_u \\ \alpha_u & \text{for } \alpha^{*} > \alpha_u, \end{cases} \tag{15}$$
where
$$\alpha_l = -(2\pi)^{-1}\sigma$$
and
$$\alpha_u \approx 0.1956\,\sigma.$$
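Equations 14 and 15 can be combined into a single routine. Since Equation A.3 is not reproduced in this section, the sketch below assumes the inner product reduces, for sampled contours and a real-valued gain, to the real part of the Hermitian inner product; the names `g` and `constrained_gain` follow Equation 13 but are otherwise illustrative.

```python
import math

def g(w, sigma):
    """Unit-gain displacement field of Equation 13, evaluated at w = z - zeta."""
    r = abs(w)
    if r == 0.0:
        return 0j
    return (w / r) * math.sin(2.0 * math.pi * r / sigma) * math.exp(-(r / sigma) ** 2)

def constrained_gain(target, model, zeta, sigma):
    """Optimal gain (Equation 14) clamped to the diffeomorphism bounds (Equation 15)."""
    basis = [g(z - zeta, sigma) for z in model]
    energy = sum(abs(b) ** 2 for b in basis)
    if energy == 0.0:
        return 0.0  # the formlet does not move any model point
    # Assumed inner product: real part of sum(residual * conj(basis)).
    inner = sum(((t - z) * b.conjugate()).real for t, z, b in zip(target, model, basis))
    alpha = inner / energy
    lower, upper = -sigma / (2.0 * math.pi), 0.1956 * sigma
    return min(max(alpha, lower), upper)
```

Clamping is sufficient because the error is a convex quadratic in α: when the unconstrained minimizer lies outside the interval, the nearest endpoint is the constrained minimizer.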
Thus search for the optimal formlet can proceed by sampling from a dictionary
over location ζ and scale σ parameters, computing the optimal constrained
gain α∗_c in each case, and then selecting the resulting formlet that yields minimum
error.
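Putting these pieces together, the greedy search of Algorithm 1 can be sketched as below. This is an illustrative reimplementation under stated assumptions rather than the authors' MATLAB code: the dictionary is a plain list of (ζ, σ) pairs, the inner product is taken as the real part of the Hermitian inner product, and the function names are hypothetical.

```python
import math

def g(w, sigma):
    """Unit-gain displacement field of Equation 13."""
    r = abs(w)
    if r == 0.0:
        return 0j
    return (w / r) * math.sin(2.0 * math.pi * r / sigma) * math.exp(-(r / sigma) ** 2)

def error(target, model):
    """Discretized Equation 11 (mean squared distance between matched points)."""
    return sum(abs(t - z) ** 2 for t, z in zip(target, model)) / len(target)

def best_gain(target, model, zeta, sigma):
    """Clamped optimal gain (Equations 14 and 15) and the displacement field used."""
    basis = [g(z - zeta, sigma) for z in model]
    energy = sum(abs(b) ** 2 for b in basis)
    if energy == 0.0:
        return 0.0, basis
    inner = sum(((t - z) * b.conjugate()).real for t, z, b in zip(target, model, basis))
    alpha = min(max(inner / energy, -sigma / (2.0 * math.pi)), 0.1956 * sigma)
    return alpha, basis

def formlet_pursuit(target, model, dictionary, K):
    """Greedy pursuit: at each step apply the dictionary formlet with lowest error."""
    chosen, errors = [], []
    for _ in range(K):
        best = None
        for zeta, sigma in dictionary:
            alpha, basis = best_gain(target, model, zeta, sigma)
            candidate = [z + alpha * b for z, b in zip(model, basis)]
            err = error(target, candidate)
            if best is None or err < best[0]:
                best = (err, candidate, (zeta, sigma, alpha))
        err, model, params = best
        chosen.append(params)
        errors.append(err)
    return model, chosen, errors
```

Because a gain of zero always lies inside the diffeomorphism bounds, the error in this sketch cannot increase from one iteration to the next.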
Figure 6 shows an example of formlet pursuit with this dictionary on an
example animal shape.
Figure 6: Formlet pursuit of an example animal shape. We first show the initial unit
circle, followed by the least-squares ellipse embryo Γ^0(t), and the models Γ^k, where k =
1, 2, 3, 4, 8, 16, 32. The last curve shows the model Γ^32 without the target curve Γ^obs.
4.2. Dictionary Descent Method
While the formlet pursuit method has the advantage of simplicity, it is far
from optimal, as it ignores most smoothness properties that the error function
might enjoy, aside from the quadratic dependence upon the gain α. As a con-
sequence one must face the tradeoff between accuracy, which requires that the
parameter space be sampled finely, and speed, which limits the capacity of the
dictionary.
We can potentially improve upon the standard dictionary method by employing
a smaller dictionary, and initiating a local gradient descent search from
the m most promising formlets to determine the formlet parameters that locally
minimize the error function.
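The idea can be illustrated with a short sketch. Note the substitution: instead of the gradient-based least-squares solver used in the paper, the code below refines each of the m best dictionary entries with a derivative-free compass (pattern) search over ζ and σ, recomputing the clamped optimal gain at every trial point. All function names, step sizes, and iteration limits are assumptions.

```python
import math

def g(w, sigma):
    """Unit-gain displacement field of Equation 13."""
    r = abs(w)
    if r == 0.0:
        return 0j
    return (w / r) * math.sin(2.0 * math.pi * r / sigma) * math.exp(-(r / sigma) ** 2)

def best_error(target, model, zeta, sigma):
    """Error after applying the formlet at (zeta, sigma) with its clamped optimal gain."""
    basis = [g(z - zeta, sigma) for z in model]
    energy = sum(abs(b) ** 2 for b in basis)
    alpha = 0.0
    if energy > 0.0:
        inner = sum(((t - z) * b.conjugate()).real for t, z, b in zip(target, model, basis))
        alpha = min(max(inner / energy, -sigma / (2.0 * math.pi)), 0.1956 * sigma)
    err = sum(abs(t - z - alpha * b) ** 2 for t, z, b in zip(target, model, basis)) / len(target)
    return err, alpha

def compass_refine(target, model, zeta, sigma, step=0.1, min_step=1e-3, max_iter=200):
    """Derivative-free local search (a stand-in for gradient-based descent)."""
    err = best_error(target, model, zeta, sigma)[0]
    for _ in range(max_iter):
        if step < min_step:
            break
        moves = [(zeta + step, sigma), (zeta - step, sigma),
                 (zeta + 1j * step, sigma), (zeta - 1j * step, sigma),
                 (zeta, sigma + step)]
        if sigma - step > 0.0:
            moves.append((zeta, sigma - step))
        trials = [(best_error(target, model, z2, s2)[0], z2, s2) for z2, s2 in moves]
        best = min(trials, key=lambda trial: trial[0])
        if best[0] < err:
            err, zeta, sigma = best   # accept the improving move
        else:
            step *= 0.5               # no improvement: shrink the search pattern
    return err, zeta, sigma

def dictionary_descent_step(target, model, dictionary, m):
    """Rank dictionary entries by error, refine the m best, and return the winner."""
    ranked = sorted(dictionary, key=lambda p: best_error(target, model, p[0], p[1])[0])
    refined = [compass_refine(target, model, zeta, sigma) for zeta, sigma in ranked[:m]]
    return min(refined, key=lambda r: r[0])
```

Since each refinement only accepts moves that lower the error, the refined result is never worse than the best entry of the coarse dictionary.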
Figure 7 compares pursuit for the standard dictionary and dictionary descent
methods on a particular example animal shape: the higher accuracy of the
dictionary descent method is evident. Table 1 shows the performance of the two
methods on the entire shape dataset. The dictionary descent method improves
accuracy by roughly 11%, and runs about 42% faster than standard pursuit. We
use the dictionary descent method in our evaluation below. An implementation
is available at www.elderlab.yorku.ca/formlets.
Figure 7: Pursuit of an example animal shape with standard dictionary search (top row) and
dictionary descent (bottom row) for K=1,2,4,8,16.
Table 1: Comparison of Dictionary and Dictionary Descent methods on entire animal dataset.
Optimization Method    L2 Error    Run Time (min)
Dictionary             0.00535     1.9
Dictionary Descent     0.00476     1.1
5. Implementation Details
5.1. Shape Dataset
To explore the inverse problem of constructing formlet representations of
planar shapes, we employ a database consisting of 391 blue-screened images of
animal models from the Hemera Photo-Object database. The boundary of each
object was sampled at 128 points at regular arc-length intervals. Each resulting
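polygon was then shifted to have 0 mean and scaled to have unit L2 norm in
both vertical and horizontal directions:
$$\int_0^1 \operatorname{Re}\left(\Gamma^{\mathrm{obs}}(t)\right)^2 dt = \int_0^1 \operatorname{Im}\left(\Gamma^{\mathrm{obs}}(t)\right)^2 dt = 1. \tag{16}$$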
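A minimal sketch of this normalization for a sampled contour follows, with the integrals of Equation 16 approximated by means over the points. The name `normalize_shape` and the choice to return the two scale factors (so the aspect-ratio distortion can later be inverted for display) are assumptions.

```python
import math

def normalize_shape(points):
    """Centre a contour and scale each axis to unit L2 norm (Equation 16).

    Returns the normalized points and the (horizontal, vertical) scale factors.
    Assumes the contour has nonzero extent along both axes.
    """
    n = len(points)
    centroid = sum(points) / n
    centered = [z - centroid for z in points]
    sx = math.sqrt(sum(z.real ** 2 for z in centered) / n)
    sy = math.sqrt(sum(z.imag ** 2 for z in centered) / n)
    normalized = [complex(z.real / sx, z.imag / sy) for z in centered]
    return normalized, (sx, sy)
```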
This scaling generally alters the aspect ratio of the shape: we invert this distor-
tion when displaying our results. The full dataset of object shapes used in this
paper is available at www.elderlab.yorku.ca/formlets.
5.2. Dictionary Method: Discretization
To evaluate this formlet pursuit algorithm, we constructed a dictionary con-
sisting of a regular sampling of the position parameter ζon a 64 ×64 grid
roughly 4 times the extent of the average shape, and the scale parameter σat
16 regularly-spaced values over (0,0.8].
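The dictionary described here can be enumerated as a list of (ζ, σ) pairs. In the sketch below the grid half-width of 2.0 is a guess at "roughly 4 times the extent of the average shape" and should be treated as a placeholder, as should the name `build_dictionary`; the scales are spaced so that the largest equals 0.8 and zero is excluded.

```python
def build_dictionary(grid=64, n_scales=16, half_width=2.0, max_scale=0.8):
    """Regular grid of formlet centres crossed with regularly spaced scales on (0, max_scale]."""
    coords = [-half_width + 2.0 * half_width * k / (grid - 1) for k in range(grid)]
    scales = [max_scale * (j + 1) / n_scales for j in range(n_scales)]
    return [(complex(x, y), s) for x in coords for y in coords for s in scales]

dictionary = build_dictionary()
```

With the defaults this yields 64 × 64 × 16 = 65,536 candidate formlets, each of which needs only a closed-form gain computation during pursuit.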
5.2.1. Tuning the Dictionary Descent Method
Since our objective function is the L2-norm of the residual error between
the observed curve and the approximation, we employed the MATLAB function
lsqnonlin(), which is optimized for non-linear least squares problems, and computed
the Jacobian of the objective function analytically (Appendix B).
We tuned the parameters of our Dictionary Descent method in stepwise
fashion. First, we determined appropriate values for the tolerance parameters
xTol and fTol of lsqnonlin(), which determine the stopping criteria for the
parameters and error function, respectively. We employed a sparse dictionary,
sampling the position parameter ζ on a 16 × 16 grid, and the scale parameter σ
at 4 regularly-spaced values over (0, 0.8]. We initiated descent at the m = 100
lowest error solutions. Using a small subset of our animal dataset containing
only four animal shapes, we performed a grid search in log space over the xTol
and fTol parameters in the range 10^−1 to 10^−9, computing the average running
time and L2 error for a 32-formlet approximation. All experiments were
conducted on a Power Mac G5 with a 2.66 GHz quad-core Intel Xeon processor,
running MATLAB R2009b.
The results are shown in Table 2. Error was found to be minimized for
parameter values of xTol = 10^−3, fTol = 10^−6: we used these values for all
further experiments.
Second, we optimized the density of the dictionary and number m of dictionary
formlets selected for descent, using the descent parameters optimized
above, the same 4 training shapes, and 32-formlet approximation. The running
time and accuracy results are shown in Tables 3 and 4 respectively. Sampling
ζ on a 51 × 51 grid, the scale parameter σ at 13 values, and launching m = 25
descents from the most promising formlets, we found that for these four training
images we could improve the accuracy over the standard dictionary method by
a factor of more than two, while saving roughly 30% in computation time.
Table 2: Average L2 error (×100) for a 32-formlet model, as a function of the gradient descent
termination criteria. Rows give fTol; columns give xTol.
fTol \ xTol   1E-01  1E-02  1E-03  1E-04  1E-05  1E-06  1E-07  1E-08  1E-09
1E-01          6.86   6.31   6.54   6.54   6.54   6.54   6.54   6.54   6.54
1E-02          5.68   5.26   5.11   5.02   5.02   5.02   5.02   5.02   5.02
1E-03          5.91   4.66   4.04   4.16   4.16   4.16   4.16   4.16   4.16
1E-04          5.80   3.65   4.02   3.72   3.72   3.72   3.72   3.72   3.72
1E-05          5.75   3.69   3.66   3.73   3.90   3.90   3.90   3.90   3.90
1E-06          5.75   3.69   3.52   3.73   3.74   3.74   3.74   3.74   3.74
1E-07          5.75   3.69   3.54   3.71   3.66   3.66   3.66   3.66   3.66
1E-08          5.75   3.69   3.54   3.67   3.66   3.66   3.66   3.66   3.66
1E-09          5.75   3.69   3.54   3.67   3.66   3.66   3.66   3.66   3.66
Table 3: Average running time per shape (min) for a 32-formlet model, as a function of
dictionary size n and number of descents m.
m \ n   64²×16  58²×14  51²×13  45²×11  38²×10  32²×8  26²×6  19²×5
0         1.28    0.93    0.68    0.45    0.29   0.16   0.09   0.04
1         1.38    1.01    0.74    0.50    0.33   0.21   0.11   0.07
5         1.44    1.06    0.78    0.55    0.38   0.24   0.18   0.13
10        1.49    1.10    0.84    0.60    0.44   0.31   0.25   0.20
15        1.56    1.17    0.90    0.67    0.50   0.37   0.32   0.28
20        1.60    1.23    0.95    0.73    0.56   0.42   0.41   0.35
25        1.67    1.28    1.01    0.78    0.62   0.48   0.47   0.42
30        1.72    1.34    1.07    0.84    0.68   0.55   0.53   0.51
Table 4: Average residual (×1000) for a 32-formlet model, as a function of dictionary size n
and number of descents m.
m \ n   64²×16  58²×14  51²×13  45²×11  38²×10  32²×8  26²×6  19²×5
0          8.0     6.5     8.1     9.5     9.3   14.0   23.5   28.1
1          3.7     4.7     4.8     5.8     5.0    5.9    7.4   14.0
5          3.6     3.8     4.7     5.9     4.2    6.4    7.4    7.1
10         4.1     3.7     3.8      45     4.5    6.1    6.9    6.4
15         3.6     3.9     4.1     4.0     4.1    5.7    6.7    8.0
20         3.4     3.8     3.8     4.5     4.1    4.8    6.6    8.0
25         3.7     3.8     3.7     4.5     4.2    4.8    6.5    7.4
30         3.4     3.9     3.7     4.4     4.2    4.4    5.7    7.5
Interestingly, we found that tightening tolerance parameters, increasing the
dictionary density, or increasing the number of deployments of the optimizer did
not always decrease the error. However, at a given iteration, error did decrease
monotonically as a function of each of these parameters, as expected. Thus the
non-monotonic variation in error with these parameters appears to reflect the
non-optimality of the greedy pursuit algorithm. In other words, selecting the
formlet that minimizes the residual at stage i will not necessarily lead to the
smallest error at stage k > i.
6. Evaluation
To evaluate and compare shape models, we address the problem of contour
completion, using our animal shape dataset. In natural scenes, object bound-
aries are often fragmented by occlusion and loss of contrast: contour completion
is the process of filling in the missing parts. Completion can also be an impor-
tant component of perceptual organization algorithms: given one or more partial
contour hypotheses, completion can be used to estimate the locations of missing
parts. These estimates can then guide the search for corroborating evidence.
We compare our formlet model with the shapelet model described in Section
2.1 [9]. For each shape in the dataset, we simulate the occlusion of a 10% or
30% continuous section of the contour, and allow the two methods to pursue
only the remaining visible portion.
The rate of convergence of both formlet and shapelet methods depends upon
how the parameters are sampled. For formlet pursuit, we use the dictionary de-
scent method described in Section 4.2. For the shapelet method, we used the
standard dictionary method of Dubinskiy et al. [9], optimizing performance by
sampling as finely as possible given time constraints. The shapelet representation
assumes an arc-length representation of the curves on t ∈ [0,1], and each
shapelet component has an arc-length position µ and scale σ. We sampled the
position parameter µ at 128 regularly-spaced values over [0,1], and the scale
parameter σ at 128 regularly-spaced values over (0,1]. The affine parameters
were computed analytically [9].
The formlet and shapelet pursuit algorithms were initialized with the same
embryonic ellipses, and were governed by a minimization of the L2 error (Equation 11)
over the visible points of the curves only. While pursuit is based on
a fixed 1:1 correspondence between points on the target and model curves, we
measure performance using the L2 Hausdorff distance to avoid potential dependence
of the evaluation upon the parameterization of the curves. Specifically,
we define the error between the target shape and the model as the root-mean-square
minimum distance of a point on one of the shapes to the other shape:
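$$ \xi_H(\Gamma^{obs},\Gamma^{k}) = \sqrt{\int_0^1 \frac{1}{2}\left[\min_{t'\in[0,1)} \left|\Gamma^{obs}(t)-\Gamma^{k}(t')\right|^2 + \min_{t'\in[0,1)} \left|\Gamma^{obs}(t')-\Gamma^{k}(t)\right|^2\right] dt}. \tag{17} $$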
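As a rough illustration of Equation 17, the following sketch evaluates a discrete version of the symmetric error on sampled curves. It assumes each curve is a list of complex samples taken uniformly in t, replaces the integral by a sample mean, and takes each minimum over the samples of the other curve; the function names `hausdorff_rms` and `circle` are illustrative and are not taken from the authors' implementation.

```python
import cmath
import math


def hausdorff_rms(curve_a, curve_b):
    """Discrete analogue of Equation 17 for two sampled closed curves.

    Each curve is a list of complex samples. The integral over t is
    approximated by a mean over samples, and each inner minimum is taken
    over the samples of the other curve (assumptions of this sketch).
    """
    def mean_sq_min_dist(src, dst):
        # Mean over src of the squared distance to the nearest dst sample.
        return sum(min(abs(p - q) ** 2 for q in dst) for p in src) / len(src)

    return math.sqrt(0.5 * (mean_sq_min_dist(curve_a, curve_b)
                            + mean_sq_min_dist(curve_b, curve_a)))


def circle(radius, n, phase=0.0):
    """Hypothetical helper: n samples of a circle, starting at angle `phase`."""
    return [radius * cmath.exp(1j * (2 * math.pi * i / n + phase))
            for i in range(n)]


if __name__ == "__main__":
    unit = circle(1.0, 200)
    # Same point set, different starting point: the error is insensitive
    # to this change of parameterization.
    rotated = circle(1.0, 200, phase=2 * math.pi * 16 / 200)
    bigger = circle(1.1, 200)
    print(hausdorff_rms(unit, rotated))
    print(hausdorff_rms(unit, bigger))
```

Because each minimum is taken over the whole of the other curve, shifting the starting point of one curve leaves the measure essentially unchanged, which is the property motivating its use for evaluation here.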
We measured the residual error between the model and target for both the
visible and occluded portions of the shapes. Performance on the occluded por-
tion, where the model is under-constrained by the data, reveals how well the
structure of the model captures properties of natural shapes.
Implementations for both the formlet and shapelet models are available at
www.elderlab.yorku.ca/formlets.
6.1. Results
Figure 8 shows some example qualitative results for this experiment. While
shapelet pursuit introduces topological errors in both visible and occluded re-
gions, formlet pursuit remains topologically valid, as predicted.
Figure 8: Examples of 30% occlusion pursuit with shapelets (red) and formlets (blue) for
k = 0, 2, 4, 8, 16, 32. Solid lines indicate visible contour, dashed lines indicate occluded contour.
Figure 9 shows quantitative results for this experiment. While the shapelet
and formlet models achieve comparable error on the visible portions of the
boundaries, the error is substantially lower for the formlet representation on
the occluded portions. This suggests that the structure of the formlet model
better captures regularities in the shapes of natural objects. We believe that there
are two principal reasons for this: a) respecting the topology of the shape prunes
off many inferior completion solutions, and b) working in the image space,
rather than arc length, allows the formlet model to better capture important
regional properties of shape.
[Figure 9 plots: two panels (10% Occlusion, 30% Occlusion), each showing Normalized RMS Error against Number of Components (0 to 30) for four curves: Formlet Occluded, Formlet Visible, Shapelet Occluded, Shapelet Visible.]
Figure 9: Results of occlusion pursuit evaluation. Black denotes error for $\Gamma^0(t)$, the affine-fit
ellipse.
7. Discussion
7.1. Formlet Parameter Distributions
The focus of this paper is to establish the appropriate structural properties
for a generative model of planar shape. To ultimately apply this representation
to problems such as object detection and recognition, statistical models over this
representation must be developed. One small step is to consider the distribution
of formlet parameters selected in pursuit of the shapes in our animal dataset.
Figure 10 shows how the means of the formlet parameters vary as pursuit
unfolds. We observe that scales decrease over time (a), reflecting a coarse-to-fine
approximation. Gains also decrease over time (b), although when normalized
by scale (c), this decline is moderated substantially. Finally, formlet locations
are biased to the centre of the shape and are roughly isotropic (d), with a slight
bias to the lower field, presumably reflecting the additional details required to
represent the legs of the animals.
7.2. Alternative Formlet Bases
In this paper we have chosen a particular Gabor-like formlet representation
(Equation 3) that confers several key properties:
1. The family of formlets forms a self-similar scale space.
2. Each formlet acts within a σ-ball around a specific location ζ, converging
to the identity as |z − ζ| → ∞.
3. The mapping is smooth everywhere except at ζ, where it is $C^0$.
4. Deformation is isotropic and radial around ζ.
There are of course other formulations that would also satisfy these prop-
erties. Here we consider two specific alternatives and compare them with the
Gabor formulation.
[Figure 10 panels: (a) Mean scale at each iteration: mean scale (σ) against formlet number k. (b) Mean gain at each iteration (expansive formlets): mean absolute gain (|α|) against formlet number k. (c) Mean gain at each iteration (compressive formlets): mean absolute gain:scale (|α|/σ) against formlet number k. (d) Location histogram: Im(ζ) against Re(ζ).]
Figure 10: Marginal distributions of formlet parameters. Error bars indicate standard error
of the mean.
7.3. Gaussian Derivative Formlets
We simplify the original Gabor formulation of Equation 3 by replacing the
sinusoidal factor with a first-order Taylor series approximation, yielding:
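$$ f(z;\zeta,\sigma,\alpha) = \zeta + \frac{z-\zeta}{|z-\zeta|}\,\rho(|z-\zeta|), \quad \text{where} \tag{18} $$
$$ \rho(r) = r + \alpha\,\frac{2\pi r}{\sigma}\exp\left(-\frac{r^2}{\sigma^2}\right). \tag{19} $$
Note that the deformation term of the radial deformation function ρ(r) is
proportional to the first Gaussian derivative in r.
f is a diffeomorphism iff ρ′(r) > 0 everywhere:
$$ \rho'(r) = 1 + \exp\left(-\frac{r^2}{\sigma^2}\right)\frac{2\pi\alpha}{\sigma}\left(1-\frac{2r^2}{\sigma^2}\right) > 0. \tag{20} $$
For α < 0, the minimum is attained when r = 0:
$$ \Rightarrow \alpha > -\frac{1}{2\pi}\,\sigma \tag{21} $$
For α > 0, by solving ρ″(r) = 0 it can be shown that the minimum is attained
when $r = \sqrt{3/2}\,\sigma$. Substituting into Equation 20 then yields
$$ \alpha < \frac{\exp(3/2)}{4\pi}\,\sigma \tag{22} $$
Thus f is a diffeomorphism iff
$$ \alpha \in \frac{\sigma}{2\pi}\left(-1,\ \tfrac{1}{2}\exp(3/2)\right). $$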
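The gain bounds of Equations 21 and 22 can be checked numerically. The sketch below is not part of the authors' code; it evaluates ρ′(r) from Equation 20 on a dense grid and reports its minimum for gains just inside and just outside the stated interval. The function names, the grid extent of 6σ and the choice σ = 0.3 are assumptions made purely for illustration.

```python
import math


def rho_prime_gaussian(r, sigma, alpha):
    """Radial derivative rho'(r) of the Gaussian-derivative formlet (Equation 20)."""
    u = (r / sigma) ** 2
    return 1.0 + math.exp(-u) * (2.0 * math.pi * alpha / sigma) * (1.0 - 2.0 * u)


def gain_bounds_gaussian(sigma):
    """Open interval of admissible gains from Equations 21 and 22."""
    return -sigma / (2.0 * math.pi), sigma * math.exp(1.5) / (4.0 * math.pi)


def min_rho_prime(sigma, alpha, extent=6.0, samples=20001):
    """Smallest rho'(r) on a uniform grid over [0, extent * sigma]."""
    step = extent * sigma / (samples - 1)
    return min(rho_prime_gaussian(i * step, sigma, alpha) for i in range(samples))


if __name__ == "__main__":
    sigma = 0.3  # illustrative scale
    lower, upper = gain_bounds_gaussian(sigma)
    for alpha in (0.99 * lower, 1.01 * lower, 0.99 * upper, 1.01 * upper):
        # A positive minimum means the radial map is strictly increasing on the grid.
        print(f"alpha={alpha:+.4f}  min rho'={min_rho_prime(sigma, alpha):+.4f}")
```

Gains slightly inside the interval should give a positive minimum and gains slightly outside a negative one, consistent with the derivation above.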
7.4. Spline Formlets
Both the Gabor and Gaussian formlets have infinite support, which increases
computation time and limits the degree to which formlets can be computed in
parallel. To achieve strictly compact support we impose the constraint that
ρ(r; σ) = r ⟺ f(z) = z whenever r > σ. To guarantee smoothness, we
require ρ(σ; σ) = σ and ρ′(σ; σ) = 1, and to achieve continuity at ζ we require
ρ(0) = 0. The simplest spline meeting all these conditions is:
$$ \rho(r;\sigma) = \begin{cases} r + \alpha\,\dfrac{r}{\sigma^2}(r-\sigma)^2 & \text{for } r \le \sigma \\ r & \text{for } r > \sigma \end{cases} \tag{23} $$
We derive the diffeomorphism constraints as before:
$$ \rho'(r) = 1 + \frac{\alpha}{\sigma^2}\left[(r-\sigma)^2 + r\cdot 2(r-\sigma)\right] > 0 \tag{24} $$
$$ \Rightarrow \frac{\alpha}{\sigma^2}\left(3r^2 - 4r\sigma + \sigma^2\right) > -1 \tag{25} $$
For α < 0, the minimum is attained when r = 0, yielding α > −1.
For α > 0, by solving ρ″(r) = 0 it can be shown that the minimum is
attained when r = 2σ/3. Substituting into Equation 24 then yields α < 3.
Thus f is a diffeomorphism iff α ∈ (−1, 3).
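A minimal sketch of the spline formlet of Equation 23 follows, with points of the plane represented as Python complex numbers. It is an illustration under stated assumptions rather than the authors' implementation: the names `rho_spline`, `spline_formlet` and `is_radially_monotone` are hypothetical, and monotonicity is only tested on a finite grid.

```python
def rho_spline(r, sigma, alpha):
    """Radial deformation of Equation 23 (identity outside the sigma-ball)."""
    if r > sigma:
        return r
    return r + alpha * r * (r - sigma) ** 2 / sigma ** 2


def spline_formlet(z, zeta, sigma, alpha):
    """Apply the spline formlet centred at zeta to a complex point z."""
    r = abs(z - zeta)
    if r == 0.0 or r > sigma:
        # rho(0) = 0 fixes the centre; points outside the support are untouched.
        return z
    return zeta + (z - zeta) / r * rho_spline(r, sigma, alpha)


def is_radially_monotone(sigma, alpha, samples=2000):
    """Grid check that rho is strictly increasing on [0, 1.5 * sigma]."""
    values = [rho_spline(1.5 * sigma * i / samples, sigma, alpha)
              for i in range(samples + 1)]
    return all(b > a for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    print(spline_formlet(2 + 0j, 0j, 1.0, 2.0))    # outside the support
    print(spline_formlet(0.5 + 0j, 0j, 1.0, 2.0))  # inside the support
    for alpha in (-1.5, -0.9, 2.9, 3.5):
        print(alpha, is_radially_monotone(1.0, alpha))
```

The early return for r > σ is what makes the support strictly compact: only points inside the σ-ball need to be touched, which is the computational advantage described above.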
7.5. Comparison of Formlet Bases
Figures 11 - 12(b) show the radial deformation functions, examples of pursuit
and rate of convergence for these three different formulations. Empirically, we
find that the Gabor formulation achieves a better rate of convergence on the
animal dataset than the competing formulations, although at this stage we do
not have a clear theoretical explanation for this result.
[Figure 11 plot: ρ(r) against r (both axes 0 to 20) for the Spline, Gaussian and Gabor bases.]
Figure 11: Radial deformation function for three formlet bases.
8. Conclusion
We have developed a novel generative model of planar shape that satisfies
a number of essential properties. In this model, complex shapes are seen as
the evolution of a simple embryonic shape by successive application of simple
diffeomorphic transformations of the plane called formlets. The system is both
complete and closed, since arbitrary shapes can be modeled, and generated
shapes are guaranteed to be topologically valid. This means that the model has
the potential to support accurate probabilistic modeling. We have demonstrated
a novel dictionary descent formlet pursuit algorithm that selects formlets to ef-
ficiently approximate given target shapes. Evaluation of the formlet pursuit
model on the problem of shape completion revealed that the model is better
able to approximate parts of shapes missing due to occlusion than a competing
contour-based method. Our animal object dataset, experimental results, exam-
ple movies and implementations for both the formlet and shapelet models are
available at www.elderlab.yorku.ca/formlets.
Future Work. We hope to extend the present work in a number of ways. First,
we would like to generalize our definition of formlets to allow for anisotropic
deformation that could efficiently model elongated parts such as animal limbs.
Second, we would like to develop probabilistic models over the formlet repre-
sentation. Finally, we are interested in using the formlet pursuit algorithm for
contour grouping, using detected fragments to generate predictions for where
other fragments of the same object boundary might be found.
[Figure 12 panels: (a) Pursuit of an example shape with Gabor (blue), Gaussian (green) and spline (red) bases for K = 1, 2, 4, 8, 16. (b) Mean L2 Hausdorff error for formlet pursuit over animal dataset with three different formlet bases: Normalized RMS Error against Number of Components (0 to 30) for the Spline, Gaussian and Gabor bases.]
Figure 12: Comparison of the three different formlet bases.
References
[1] S. Belongie, J. Malik, J. Puzicha, Shape matching and object recognition
using shape contexts, Pattern Analysis and Machine Intelligence, IEEE
Trans. 24 (2002) 509–522.
[2] P. Cavanagh, What’s up in top-down processing, in: A. Gorea (Ed.),
Representations of Vision, Cambridge University Press, Cambridge, UK,
1991 edition, 1991, pp. 295–304.
[3] D. Mumford, Mathematical theories of shape: do they model perception?,
in: B. C. Vemuri (Ed.), Geometric Methods in Computer Vision, volume
1570, SPIE, 1991, pp. 2–10.
[4] F. Attneave, Some informational aspects of visual perception, Psychol.
Rev. 61 (1954) 183–193.
[5] T. F. Cootes, C. J. Taylor, D. H. Cooper, J. Graham, Active shape models
- their training and application, Comput. Vis. Image Underst. 61 (1995)
38–59.
[6] T. Pavlidis, Structural pattern recognition, volume 1, Springer-Verlag,
Berlin, illustrated edition, 1977.
[7] D. D. Hoffman, W. A. Richards, Parts of recognition, Cognition 18 (1984)
65–96.
[8] F. Mokhtarian, A. Mackworth, Scale-based description and recognition of
planar curves and two-dimensional shapes, Pattern Analysis and Machine
Intelligence, IEEE Trans. 8 (1986) 34–43.
[9] A. Dubinskiy, S. Zhu, A multiscale generative model for animate shapes
and parts, in: Proc. 9th IEEE ICCV, volume 1, pp. 249–256.
[10] H. Blum, Biological shape and visual science (part i), J. Theoretical Biology
38 (1973) 205–287.
[11] H. Blum, R. N. Nagel, Shape description using weighted symmetric axis
features, Pattern Recognition 10 (1978) 167 – 180.
[12] M. Brady, H. Asada, Smoothed local symmetries and their implementation,
Int. J. Robotics Res. 3 (1984) 36–61.
[13] M. Leyton, A process-grammar for shape, Artificial Intelligence 34 (1988)
213 – 247.
[14] B. B. Kimia, A. R. Tannenbaum, S. W. Zucker, Shapes shocks and de-
formations i: the components of two dimensional shape and the reaction
diffusion space, Int. J. Comput. Vision 15 (1995) 189–224.
[15] S. Osher, J. A. Sethian, Fronts propagating with curvature-dependent
speed, J. Comput. Phys. 79 (1988) 12–49.
[16] N. Trinh, B. Kimia, A symmetry-based generative model for shape, in:
Proc. 11th IEEE ICCV, pp. 1–8.
[17] A. Pentland, S. Sclaroff, Closed-form solutions for physically based shape
modeling and recognition, Pattern Analysis and Machine Intelligence,
IEEE Transactions on 13 (1991) 715–729.
[18] S. Sclaroff, A. Pentland, Modal matching for correspondence and recognition,
Pattern Analysis and Machine Intelligence, IEEE Trans. 17 (1995)
545–561.
[19] S.-C. Zhu, Embedding gestalt laws in markov random fields, Pattern
Analysis and Machine Intelligence, IEEE Trans. 21 (1999) 1170–1187.
[20] D. W. Thompson, On growth and form, Cambridge University Press, Cam-
bridge, UK, abridged ed./edited edition, 1961.
[21] A. Jain, Y. Zhong, S. Lakshmanan, Object matching using deformable
templates, Pattern Analysis and Machine Intelligence, IEEE Trans. 18
(1996) 267–278.
[22] E. Sharon, D. Mumford, 2d-shape analysis using conformal mapping, Com-
puter Vision and Pattern Recognition, IEEE Comp. Soc. Conf. 2 (2004)
350–357.
[23] U. Grenander, A. Srivastava, S. Saini, A pattern-theoretic characterization
of biological growth, Medical Imaging, IEEE Trans. 26 (2007) 648–659.
[24] T. Oleskiw, J. Elder, G. Peyré, On growth and formlets, Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
(2010).
[25] S. Mallat, Z. Zhang, Matching pursuits with time frequency dictionaries,
Signal Processing, IEEE Trans. 41 (1993) 3397–3415.
Appendix A. Computation of Optimal Gain
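Since the formlet deformation of Equation 3 is linear in the gain α, given
fixed location ζ and scale σ parameters, the gain that minimizes the L2 deviation
from the target shape can be computed analytically. Specifically, suppose that
the observed curve $\Gamma^{obs}$ is currently approximated by $\Gamma^{k-1}$. For given formlet
location and scale parameters ζ and σ, we define the optimal unconstrained gain
$\alpha^*$ for formlet $f_k$ as:
$$ \alpha^* = \arg\min_{\alpha\in\mathbb{R}}\ \xi\left(\Gamma^{obs}, f(\Gamma^{k-1};\zeta,\sigma,\alpha)\right) \tag{A.1} $$
where, for curves a and b, $\xi(\Gamma^a,\Gamma^b)$ denotes the L2 error metric
$$ \int_0^1 \left[\mathrm{Re}\left(\Gamma^a(t)-\Gamma^b(t)\right)^2 + \mathrm{Im}\left(\Gamma^a(t)-\Gamma^b(t)\right)^2\right] dt \tag{A.2} $$
induced by the inner product:
$$ \langle\Gamma^a,\Gamma^b\rangle = \int_0^1 \left[\mathrm{Re}\,\Gamma^a(t)\,\mathrm{Re}\,\Gamma^b(t) + \mathrm{Im}\,\Gamma^a(t)\,\mathrm{Im}\,\Gamma^b(t)\right] dt \tag{A.3} $$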
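Using Equation 13, we differentiate ξ with respect to α and set to zero:
$$ \frac{\partial}{\partial\alpha}\left\|\Gamma^{obs}-f(\Gamma^{k-1})\right\|^2 = \frac{\partial}{\partial\alpha}\left\|\Gamma^{res}-\alpha g\right\|^2 \tag{A.4} $$
$$ = \frac{\partial}{\partial\alpha}\left(\|\Gamma^{res}\|^2 - 2\alpha\langle\Gamma^{res},g\rangle + \alpha^2\|g\|^2\right) \tag{A.5} $$
$$ = -2\left(\langle\Gamma^{res},g\rangle - \alpha\|g\|^2\right) = 0 \tag{A.6} $$
$$ \Rightarrow \alpha = \frac{\langle\Gamma^{res},g\rangle}{\|g\|^2} $$
where we used the shorthand $g = g(\Gamma^{k-1}-\zeta;\sigma)$ and $\Gamma^{res} = \Gamma^{obs}-\Gamma^{k-1}$.
As a result, given fixed ζ and σ, the optimal unconstrained gain $\alpha^*$ that
maximally reduces the L2 error between the observed curve $\Gamma^{obs}$ and current
approximation $\Gamma^{k-1}$ is given by
$$ \alpha^* = \frac{\left\langle\Gamma^{obs}-\Gamma^{k-1},\ g(\Gamma^{k-1}-\zeta;\sigma)\right\rangle}{\left\|g(\Gamma^{k-1}-\zeta;\sigma)\right\|_2^2}. \tag{A.7} $$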
Note that in general Equation A.7 may produce an optimal gain outside
the diffeomorphism bounds of Equation 7. However, the optimal gain that
satisfies the constraint is simply the unconstrained gain α∗thresholded by the
diffeomorphism constraints, as described in Section 4.1.
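The projection in Equation A.7, followed by thresholding to the diffeomorphism bounds, can be sketched on sampled curves as follows. This is an illustrative approximation, not the authors' MATLAB code: curves are lists of complex samples, the inner product of Equation A.3 is replaced by a sample mean, the Gabor deformation g is taken from Equation B.1, and the bounds −σ/(2π) and 0.1956σ are the values quoted in Appendix B. The names `gabor_g`, `inner`, `optimal_gain` and `circle` are hypothetical.

```python
import cmath
import math

ALPHA_U_FACTOR = 0.1956  # approximate upper-bound factor quoted in Appendix B


def gabor_g(z, zeta, sigma):
    """Unit-gain Gabor deformation g(z - zeta; sigma) as in Equation B.1."""
    r = abs(z - zeta)
    if r == 0.0:
        return 0j
    return ((z - zeta) / r * math.sin(2 * math.pi * r / sigma)
            * math.exp(-(r / sigma) ** 2))


def inner(a, b):
    """Sample-mean approximation of the inner product in Equation A.3."""
    return sum(p.real * q.real + p.imag * q.imag for p, q in zip(a, b)) / len(a)


def optimal_gain(target, current, zeta, sigma):
    """Return (unconstrained gain, gain clipped to the diffeomorphism bounds)."""
    g = [gabor_g(z, zeta, sigma) for z in current]
    residual = [t - c for t, c in zip(target, current)]
    energy = inner(g, g)
    if energy == 0.0:
        return 0.0, 0.0  # the formlet does not move any sample
    alpha = inner(residual, g) / energy            # Equation A.7
    alpha_l = -sigma / (2 * math.pi)               # lower bound
    alpha_u = ALPHA_U_FACTOR * sigma               # upper bound
    return alpha, min(max(alpha, alpha_l), alpha_u)


def circle(radius, n):
    """Hypothetical helper: n samples of a circle as a stand-in embryonic shape."""
    return [radius * cmath.exp(2j * math.pi * i / n) for i in range(n)]


if __name__ == "__main__":
    current = circle(1.0, 60)
    zeta, sigma = 0.8 + 0.1j, 0.5
    # Synthetic target: the current curve deformed by a known gain.
    target = [z + 0.05 * gabor_g(z, zeta, sigma) for z in current]
    print(optimal_gain(target, current, zeta, sigma))
```

When the target is exactly the current curve deformed by a single formlet at the same ζ and σ, the residual equals αg and the projection recovers that gain, which makes a convenient sanity check.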
Appendix B. Jacobian Computation for Nonlinear Least Squares Minimization
The dictionary descent optimization method described in Section 4.2 employs
the MATLAB nonlinear least-squares solver lsqnonlin to determine the location
parameter ζ and scale parameter σ. lsqnonlin uses the Jacobian of the
error function in the unknown parameters to iterate toward the local minimum.
The method performs best if an analytic form of the Jacobian can be provided.
Note that since the optimal gain $\alpha^*_c$ is determined analytically (Equation 15),
this value must be used in all computations of the Jacobian in order to determine
locally optimal values for the other parameters.
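Combining Equations 11 and 13, and using $r = |\Gamma^{k-1}-\zeta|$, the error function
can be written as
$$ \xi(\Gamma^{obs},\Gamma^{k}) = \left\|\Gamma^{obs}-f(\Gamma^{k-1})\right\|^2 = \left\|\Gamma^{obs}-\Gamma^{k-1}-\alpha^*_c\,\frac{\Gamma^{k-1}-\zeta}{r}\sin\left(\frac{2\pi r}{\sigma}\right)\exp\left(-\frac{r^2}{\sigma^2}\right)\right\|^2. $$
Now defining
$$ \Gamma^{res} = \Gamma^{obs}-\Gamma^{k-1} $$
and
$$ G = G(r;\sigma) = \frac{1}{r}\sin\left(\frac{2\pi r}{\sigma}\right)\exp\left(-\frac{r^2}{\sigma^2}\right), $$
and using x and y subscripts to denote real and imaginary components, we can
rewrite this expression as
$$ \begin{aligned} \xi(\Gamma^{obs},\Gamma^{k}) &= \int_0^1 \left[\Gamma^{res}_x(t)-\alpha^*_c\left(\Gamma^{k-1}_x(t)-\zeta_x\right)G\right]^2 dt \\ &\quad + \int_0^1 \left[\Gamma^{res}_y(t)-\alpha^*_c\left(\Gamma^{k-1}_y(t)-\zeta_y\right)G\right]^2 dt \\ &\equiv \int_0^1 \left[\xi_x(t)^2+\xi_y(t)^2\right] dt. \end{aligned} $$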
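Since the error is a function of the optimal gain $\alpha^*_c$, and $\alpha^*_c$ is a function of
the location parameter ζ and the scale parameter σ, we will need the partial
derivatives of $\alpha^*_c$ with respect to these two parameters. From Equation 15, we
have
$$ \alpha^*_c = \begin{cases} \alpha_l & \text{for } \alpha^* < \alpha_l \\ \alpha^* & \text{for } \alpha_l \le \alpha^* \le \alpha_u \\ \alpha_u & \text{for } \alpha^* > \alpha_u, \end{cases} $$
where
$$ \alpha_l = -(2\pi)^{-1}\sigma $$
and
$$ \alpha_u \approx 0.1956\,\sigma, $$
and $\alpha^*$ is given by Equation 14. Thus we have
$$ \frac{\partial\alpha^*_c}{\partial\sigma} = \begin{cases} -(2\pi)^{-1} & \text{for } \alpha^* < \alpha_l \\ \dfrac{\partial\alpha^*}{\partial\sigma} & \text{for } \alpha_l \le \alpha^* \le \alpha_u \\ 0.1956 & \text{for } \alpha^* > \alpha_u \end{cases} $$
$$ \frac{\partial\alpha^*_c}{\partial\zeta_x} = \begin{cases} 0 & \text{for } \alpha^* < \alpha_l \\ \dfrac{\partial\alpha^*}{\partial\zeta_x} & \text{for } \alpha_l \le \alpha^* \le \alpha_u \\ 0 & \text{for } \alpha^* > \alpha_u \end{cases} $$
$$ \frac{\partial\alpha^*_c}{\partial\zeta_y} = \begin{cases} 0 & \text{for } \alpha^* < \alpha_l \\ \dfrac{\partial\alpha^*}{\partial\zeta_y} & \text{for } \alpha_l \le \alpha^* \le \alpha_u \\ 0 & \text{for } \alpha^* > \alpha_u \end{cases} $$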
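Thus to determine the partial derivatives of the constrained gain $\alpha^*_c$, we must
compute the partial derivatives of the unconstrained gain $\alpha^*$, which is defined
by Equation 14:
$$ \alpha^* = \frac{\left\langle\Gamma^{obs}-\Gamma^{k-1},\ g(\Gamma^{k-1}-\zeta;\sigma)\right\rangle}{\left\|g(\Gamma^{k-1}-\zeta;\sigma)\right\|_2^2}, $$
where we have used
$$ g(\Gamma^{k-1}-\zeta;\sigma) = (\Gamma^{k-1}-\zeta)\,\frac{1}{r}\sin\left(\frac{2\pi r}{\sigma}\right)\exp\left(-\frac{r^2}{\sigma^2}\right) = (\Gamma^{k-1}-\zeta)\,G(r). \tag{B.1} $$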
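Computing the partial derivatives with respect to the scale parameter σ and
location parameters $\zeta_x$ and $\zeta_y$, we obtain:
$$ \frac{\partial\alpha^*}{\partial\sigma} = \frac{\frac{\partial}{\partial\sigma}\langle\Gamma^{res},g\rangle\,\|g\|^2 - \langle\Gamma^{res},g\rangle\,\frac{\partial}{\partial\sigma}\|g\|^2}{\|g\|^4}, \quad \text{where:} $$
$$ \begin{aligned} \frac{\partial}{\partial\sigma}\langle\Gamma^{res},g\rangle &= \frac{\partial}{\partial\sigma}\int_0^1 \left[\Gamma^{res}_x(t)\left(\Gamma^{k-1}_x(t)-\zeta_x\right)G + \Gamma^{res}_y(t)\left(\Gamma^{k-1}_y(t)-\zeta_y\right)G\right] dt \\ &= \int_0^1 \left\langle\Gamma^{res}(t),\ \Gamma^{k-1}(t)-\zeta\right\rangle \frac{\partial G}{\partial\sigma}\, dt \\ \frac{\partial}{\partial\sigma}\|g\|^2 &= \frac{\partial}{\partial\sigma}\int_0^1 \left\{\left[\left(\Gamma^{k-1}_x(t)-\zeta_x\right)G\right]^2 + \left[\left(\Gamma^{k-1}_y(t)-\zeta_y\right)G\right]^2\right\} dt \\ &= \int_0^1 2G\,\frac{\partial G}{\partial\sigma}\left\|\Gamma^{k-1}(t)-\zeta\right\|^2 dt \end{aligned} $$
$$ \frac{\partial\alpha^*}{\partial\zeta_x} = \frac{\frac{\partial}{\partial\zeta_x}\langle\Gamma^{res},g\rangle\,\|g\|^2 - \langle\Gamma^{res},g\rangle\,\frac{\partial}{\partial\zeta_x}\|g\|^2}{\|g\|^4}, \quad \text{where:} $$
$$ \begin{aligned} \frac{\partial}{\partial\zeta_x}\langle\Gamma^{res},g\rangle &= \frac{\partial}{\partial\zeta_x}\int_0^1 \left[\Gamma^{res}_x(t)\left(\Gamma^{k-1}_x(t)-\zeta_x\right)G + \Gamma^{res}_y(t)\left(\Gamma^{k-1}_y(t)-\zeta_y\right)G\right] dt \\ &= \int_0^1 \left[\Gamma^{res}_x(t)\left(-G + \left(\Gamma^{k-1}_x(t)-\zeta_x\right)\frac{\partial G}{\partial\zeta_x}\right) + \Gamma^{res}_y(t)\left(\Gamma^{k-1}_y(t)-\zeta_y\right)\frac{\partial G}{\partial\zeta_x}\right] dt \\ \frac{\partial}{\partial\zeta_x}\|g\|^2 &= \frac{\partial}{\partial\zeta_x}\int_0^1 \left\{\left[\left(\Gamma^{k-1}_x(t)-\zeta_x\right)G\right]^2 + \left[\left(\Gamma^{k-1}_y(t)-\zeta_y\right)G\right]^2\right\} dt \\ &= \int_0^1 \left[2\left(\Gamma^{k-1}_x(t)-\zeta_x\right)G\left(-G + \left(\Gamma^{k-1}_x(t)-\zeta_x\right)\frac{\partial G}{\partial\zeta_x}\right) + 2G\,\frac{\partial G}{\partial\zeta_x}\left(\Gamma^{k-1}_y(t)-\zeta_y\right)^2\right] dt \end{aligned} $$
$$ \frac{\partial\alpha^*}{\partial\zeta_y} = \frac{\frac{\partial}{\partial\zeta_y}\langle\Gamma^{res},g\rangle\,\|g\|^2 - \langle\Gamma^{res},g\rangle\,\frac{\partial}{\partial\zeta_y}\|g\|^2}{\|g\|^4}, \quad \text{where:} $$
$$ \begin{aligned} \frac{\partial}{\partial\zeta_y}\langle\Gamma^{res},g\rangle &= \frac{\partial}{\partial\zeta_y}\int_0^1 \left[\Gamma^{res}_x(t)\left(\Gamma^{k-1}_x(t)-\zeta_x\right)G + \Gamma^{res}_y(t)\left(\Gamma^{k-1}_y(t)-\zeta_y\right)G\right] dt \\ &= \int_0^1 \left[\Gamma^{res}_x(t)\left(\Gamma^{k-1}_x(t)-\zeta_x\right)\frac{\partial G}{\partial\zeta_y} + \Gamma^{res}_y(t)\left(-G + \left(\Gamma^{k-1}_y(t)-\zeta_y\right)\frac{\partial G}{\partial\zeta_y}\right)\right] dt \\ \frac{\partial}{\partial\zeta_y}\|g\|^2 &= \frac{\partial}{\partial\zeta_y}\int_0^1 \left\{\left[\left(\Gamma^{k-1}_x(t)-\zeta_x\right)G\right]^2 + \left[\left(\Gamma^{k-1}_y(t)-\zeta_y\right)G\right]^2\right\} dt \\ &= \int_0^1 \left[2G\,\frac{\partial G}{\partial\zeta_y}\left(\Gamma^{k-1}_x(t)-\zeta_x\right)^2 + 2\left(\Gamma^{k-1}_y(t)-\zeta_y\right)G\left(-G + \left(\Gamma^{k-1}_y(t)-\zeta_y\right)\frac{\partial G}{\partial\zeta_y}\right)\right] dt \end{aligned} $$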
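We are now ready to compute the Jacobian matrix. From Equation B.1 we have
that:
$$ \xi_x(t_i) = \Gamma^{res}_x(t_i) - \alpha\left(\Gamma^{k-1}_x(t_i)-\zeta_x\right)G $$
$$ \xi_y(t_i) = \Gamma^{res}_y(t_i) - \alpha\left(\Gamma^{k-1}_y(t_i)-\zeta_y\right)G $$
Thus,
$$ \frac{\partial\xi_x(t_i)}{\partial\sigma} = -\left(\Gamma^{k-1}_x(t_i)-\zeta_x\right)\left(\frac{\partial\alpha}{\partial\sigma}G + \alpha\frac{\partial G}{\partial\sigma}\right) \tag{B.2} $$
$$ \frac{\partial\xi_y(t_i)}{\partial\sigma} = -\left(\Gamma^{k-1}_y(t_i)-\zeta_y\right)\left(\frac{\partial\alpha}{\partial\sigma}G + \alpha\frac{\partial G}{\partial\sigma}\right) \tag{B.3} $$
$$ \frac{\partial\xi_x(t_i)}{\partial\zeta_x} = -\frac{\partial\alpha}{\partial\zeta_x}\left(\Gamma^{k-1}_x(t_i)-\zeta_x\right)G + \alpha G - \alpha\left(\Gamma^{k-1}_x(t_i)-\zeta_x\right)\frac{\partial G}{\partial\zeta_x} \tag{B.4} $$
$$ \frac{\partial\xi_y(t_i)}{\partial\zeta_x} = -\left(\Gamma^{k-1}_y(t_i)-\zeta_y\right)\left(\frac{\partial\alpha}{\partial\zeta_x}G + \alpha\frac{\partial G}{\partial\zeta_x}\right) \tag{B.5} $$
$$ \frac{\partial\xi_x(t_i)}{\partial\zeta_y} = -\left(\Gamma^{k-1}_x(t_i)-\zeta_x\right)\left(\frac{\partial\alpha}{\partial\zeta_y}G + \alpha\frac{\partial G}{\partial\zeta_y}\right) \tag{B.6} $$
$$ \frac{\partial\xi_y(t_i)}{\partial\zeta_y} = -\frac{\partial\alpha}{\partial\zeta_y}\left(\Gamma^{k-1}_y(t_i)-\zeta_y\right)G + \alpha G - \alpha\left(\Gamma^{k-1}_y(t_i)-\zeta_y\right)\frac{\partial G}{\partial\zeta_y} \tag{B.7} $$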
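where for the Gabor basis, we have:
$$ \frac{\partial G}{\partial\sigma} = \exp\left(-\frac{r^2}{\sigma^2}\right)\left[-\frac{2\pi}{\sigma^2}\cos\left(\frac{2\pi r}{\sigma}\right) + \frac{2r}{\sigma^3}\sin\left(\frac{2\pi r}{\sigma}\right)\right] $$
$$ \frac{\partial G}{\partial\zeta_x} = \exp\left(-\frac{r^2}{\sigma^2}\right)\frac{\Gamma^{k-1}_x(t_i)-\zeta_x}{r}\left[-\frac{2\pi}{\sigma r}\cos\left(\frac{2\pi r}{\sigma}\right) + \frac{2}{\sigma^2}\sin\left(\frac{2\pi r}{\sigma}\right) + \frac{1}{r^2}\sin\left(\frac{2\pi r}{\sigma}\right)\right] $$
$$ \frac{\partial G}{\partial\zeta_y} = \exp\left(-\frac{r^2}{\sigma^2}\right)\frac{\Gamma^{k-1}_y(t_i)-\zeta_y}{r}\left[-\frac{2\pi}{\sigma r}\cos\left(\frac{2\pi r}{\sigma}\right) + \frac{2}{\sigma^2}\sin\left(\frac{2\pi r}{\sigma}\right) + \frac{1}{r^2}\sin\left(\frac{2\pi r}{\sigma}\right)\right] $$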
It is straightforward to show that Equations B.2 - B.7 also apply to the
Gaussian and Spline bases (Section 7.2), with suitable definitions of G(r;σ):
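Gaussian Basis:
$$ G(r;\sigma) = \frac{2\pi}{\sigma}\exp\left(-\frac{r^2}{\sigma^2}\right) $$
$$ \frac{\partial G}{\partial\sigma} = 2\pi\exp\left(-\frac{r^2}{\sigma^2}\right)\left[-\frac{1}{\sigma^2} + \frac{2r^2}{\sigma^4}\right] $$
$$ \frac{\partial G}{\partial\zeta_x} = \frac{4\pi}{\sigma^3}\exp\left(-\frac{r^2}{\sigma^2}\right)\left(\Gamma^{k-1}_x-\zeta_x\right) $$
$$ \frac{\partial G}{\partial\zeta_y} = \frac{4\pi}{\sigma^3}\exp\left(-\frac{r^2}{\sigma^2}\right)\left(\Gamma^{k-1}_y-\zeta_y\right) $$
Spline Basis:
$$ G(r;\sigma) = \frac{(r-\sigma)^2}{\sigma^2} $$
$$ \frac{\partial G}{\partial\sigma} = -\frac{2}{\sigma^3}(r-\sigma)^2 - \frac{2}{\sigma^2}(r-\sigma) $$
$$ \frac{\partial G}{\partial\zeta_x} = -\frac{2}{r\sigma^2}(r-\sigma)\left(\Gamma^{k-1}_x-\zeta_x\right) $$
$$ \frac{\partial G}{\partial\zeta_y} = -\frac{2}{r\sigma^2}(r-\sigma)\left(\Gamma^{k-1}_y-\zeta_y\right) $$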
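Analytic derivatives of this kind are easy to mistype, so a finite-difference comparison is a useful safeguard before passing a Jacobian to an optimizer. The sketch below is not part of the authors' implementation: it codes G and its partial derivatives for the Gabor, Gaussian and spline bases as given above and compares them with central differences at one test point. The function names, the test point and the step size h are illustrative assumptions, and the spline expressions are only meaningful for r ≤ σ.

```python
import math


def gabor(r, s):
    return math.sin(2 * math.pi * r / s) * math.exp(-(r / s) ** 2) / r


def gabor_ds(r, s):
    e = math.exp(-(r / s) ** 2)
    return e * (-2 * math.pi / s ** 2 * math.cos(2 * math.pi * r / s)
                + 2 * r / s ** 3 * math.sin(2 * math.pi * r / s))


def gabor_dz(r, s, d):
    # d is the matching coordinate difference, e.g. Gamma_x - zeta_x.
    e = math.exp(-(r / s) ** 2)
    sn, cs = math.sin(2 * math.pi * r / s), math.cos(2 * math.pi * r / s)
    return e * d / r * (-2 * math.pi / (s * r) * cs + 2 / s ** 2 * sn + sn / r ** 2)


def gauss(r, s):
    return 2 * math.pi / s * math.exp(-(r / s) ** 2)


def gauss_ds(r, s):
    return 2 * math.pi * math.exp(-(r / s) ** 2) * (-1 / s ** 2 + 2 * r ** 2 / s ** 4)


def gauss_dz(r, s, d):
    return 4 * math.pi / s ** 3 * math.exp(-(r / s) ** 2) * d


def spline(r, s):
    return (r - s) ** 2 / s ** 2


def spline_ds(r, s):
    return -2 / s ** 3 * (r - s) ** 2 - 2 / s ** 2 * (r - s)


def spline_dz(r, s, d):
    return -2 / (r * s ** 2) * (r - s) * d


BASES = {
    "gabor": (gabor, gabor_ds, gabor_dz),
    "gaussian": (gauss, gauss_ds, gauss_dz),
    "spline": (spline, spline_ds, spline_dz),
}


def fd_errors(name, point, zeta, s, h=1e-6):
    """Absolute gaps between analytic and central-difference derivatives of G."""
    G, dG_ds, dG_dz = BASES[name]
    dx, dy = point[0] - zeta[0], point[1] - zeta[1]
    r = math.hypot(dx, dy)
    fd_s = (G(r, s + h) - G(r, s - h)) / (2 * h)
    # Increasing zeta_x by h decreases (Gamma_x - zeta_x) by h, and likewise for y.
    fd_zx = (G(math.hypot(dx - h, dy), s) - G(math.hypot(dx + h, dy), s)) / (2 * h)
    fd_zy = (G(math.hypot(dx, dy - h), s) - G(math.hypot(dx, dy + h), s)) / (2 * h)
    return (abs(fd_s - dG_ds(r, s)),
            abs(fd_zx - dG_dz(r, s, dx)),
            abs(fd_zy - dG_dz(r, s, dy)))


if __name__ == "__main__":
    for name in BASES:
        print(name, fd_errors(name, (0.3, 0.2), (0.1, -0.05), 0.4))
```

The same pattern extends to Equations B.2 to B.7: perturb σ, $\zeta_x$ or $\zeta_y$, recompute the residuals, and compare with the analytic Jacobian entries.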