Content uploaded by Timothy Oleskiw

Author content

All content in this area was uploaded by Timothy Oleskiw on Jun 11, 2018

Content may be subject to copyright.

On Growth and Formlets: Sparse Multi-Scale Coding of

Planar Shape

James H. Elder (a,∗), Timothy D. Oleskiw (b), Alex Yakubovich (a), Gabriel Peyré (c)

(a) Centre for Vision Research, York University, Toronto, Canada

(b) Department of Applied Mathematics, University of Washington, Seattle, WA, United States

(c) CEREMADE, Université Paris-Dauphine, Paris, France

Abstract

We propose a sparse representation of 2D planar shape through the composi-

tion of warping functions, termed formlets, localized in scale and space. Each

formlet subjects the 2D space in which the shape is embedded to a localized

isotropic radial deformation. By constraining these localized warping transfor-

mations to be diﬀeomorphisms, the topology of shape is preserved, and the set

of simple closed curves is closed under any sequence of these warpings. A gener-

ative model based on a composition of formlets applied to an embryonic shape,

e.g., an ellipse, has the advantage of synthesizing only those shapes that could

correspond to the boundaries of physical objects. To compute the set of formlets

that represent a given boundary, we demonstrate a greedy coarse-to-ﬁne formlet

pursuit algorithm that serves as a non-commutative generalization of matching

pursuit for sparse approximations. We evaluate our method by pursuing par-

tially occluded shapes, comparing performance against a contour-based sparse

shape coding framework.

Keywords: planar shape, deformation, sparse coding, contour completion

1. Introduction

Shape information is important for a broad range of computer vision prob-

lems. For some detection and recognition tasks, discriminative models that use

non-invertible shape codes (e.g., [1]) can be eﬀective. However, many other

tasks call for a more complete generative model of shape. Examples include:

(1) shape segmentation, recognition, and tracking in cluttered scenes, where

∗ Corresponding author.

Email addresses: jelder@yorku.ca (James H. Elder), oleskiw@uw.edu (Timothy D. Oleskiw), yakuboa@yorku.ca (Alex Yakubovich), gabriel.peyre@ceremade.dauphine.fr (Gabriel Peyré)

URL: www.yorku.ca (James H. Elder), http://depts.washington.edu/amath/people/Timothy.Oleskiw (Timothy D. Oleskiw), elderlab.yorku.ca/~alex (Alex Yakubovich), http://www.ceremade.dauphine.fr/~peyre (Gabriel Peyré)

Preprint submitted to Image and Vision Computing January 4, 2013

shapes must be distinguished not just from each other, but from ‘phantom’

shapes formed by conjunctions of features from multiple objects [2]; (2) model-

ing of shape articulation, growth, and deformation; and (3) modeling of shape

similarity.

Our paper concerns the generative modeling of natural 2D shapes in the

plane, represented by their 1D boundary. We restrict our attention to simply-

connected shapes whose boundaries are smooth, simple, and closed curves. We

seek a generative shape model that satisﬁes a set of properties that seem to us

essential:

1. Completeness. The model can produce all shapes.

2. Closure. The set of valid shapes is closed under the generative model. In

other words, the model generates only valid shapes.

3. Composition. Complex shapes are generated by combining simpler com-

ponents.

4. Sparsity. Good approximations of shape can be generated with relatively

few components.

5. Progression. Approximations can be improved by incorporating more

components.

6. Locality. Components are localized in space.

7. Scaling. Components are tuned to speciﬁc scales and are self-similar over

scale.

8. Region & Contour. Components can capture both region and contour

properties in a natural way.

The need for completeness is self-evident if the system is to be general.

Closure is critical if we hope to capture the statistics of natural shape in a set of

hidden generative variables. Without closure, heuristics must be used to avoid

the generation of invalid shapes, e.g., bounding contours with self-intersections.

Aside from the resulting ineﬃciency, this creates a discrepancy between the

statistical structure encoded by the model, and samples the model produces. In

other words, the model cannot fully capture the statistics of natural boundaries.

Composition (here we use the word in a general sense) is important if we are

to handle the richness and complexity of natural shapes while maintaining con-

ceptual simplicity. Given the high dimensionality of natural shapes, sparsity is

necessary in order to store shape models [3]. Sparsity also implies that essential

shape features have been made explicit [4]. Progression allows the complexity

of the model to be matched to the diﬃculty of the task, facilitating real-time

operation and coarse-to-ﬁne optimization.

Locality is a natural goal, since a ﬁrst-order property of natural images

is local coherence. Nearby points on the surface of an object tend to have

similar reﬂectance, attitude, and illumination. Locality also allows for greater

robustness to occlusion, since components are more likely to be either entirely

visible or removed altogether rather than distorted. Scaling allows invariance

over object size, and allows shape features of diﬀerent sizes to be captured

separately.

Finally, it has long been recognized that planar shape description requires

attention to both region and contour properties [3]. Some shape properties,

e.g., curvature, are naturally described by the bounding contour. Others, e.g.,

necks, are best described as region properties, since they involve points that are

proximal in the image but distant along the contour. A good generative model

will allow both to be encoded in a natural way.

We begin by reviewing prior models, with an eye to each of these essential

properties.

2. Prior Work

Early models that used chain coding or splines to encode shapes were not

generative and failed to succinctly capture global properties of shape. Fourier

descriptor, moment, and PCA bases have the potential to be generative, but

since all components are global, they are not robust to occlusion or local defor-

mation [5, 3, 6]. For these reasons, most modern approaches attempt to capture

structure at intermediate scales, or over a range of scales. Most of these models

can be crudely partitioned into two classes: contour-based and symmetry-based.

2.1. Contour-Based Models

Attneave [4] pointed to the concentration of information in the curvature of

the bounding contour, and suggested the potential for sparse descriptions based

on points of extremal curvature magnitude. Hoﬀman & Richards [7] linked cur-

vature to the part structure of shapes, proposing that parts are perceptually

segmented at negative minima of curvature. Mokhtarian & colleagues empha-

sized the encoding of curvature inﬂections across scale space for the purpose of

shape recognition [8].

While none of these early models are generative, Dubinskiy & Zhu [9] have

more recently proposed a contour-based shape representation that is both gen-

erative and sparse. The theory is based upon the representation of a shape by

a summation of component shapelets. A shapelet is a primitive curve deﬁned

by Gabor-like coordinate functions that map arclength to the image, which can

be represented by the complex plane.

Specifically, a shapelet γ(t; σ, µ) is a mapping of arc length t ∈ [0, 1] to the image, represented by the complex plane. Each shapelet is parameterized by an arc length position parameter µ and a scale parameter σ, and has the specific form:

\[ \gamma(t;\sigma,\mu) = \exp\!\left(-\frac{(t-\mu)^2}{2\sigma^2}\right)\left[\cos\!\left(\frac{2\pi}{\sigma}(t-\mu)\right) + i\,\sin\!\left(\frac{2\pi}{\sigma}(t-\mu)\right)\right]. \tag{1} \]

Figure 1 shows the coordinate functions and trace of an example shapelet. Note that the planar curves generated by γ(t; σ, µ) are identical on t ∈ R up to a linear reparameterization, i.e., they are self-similar. However, these functions are only approximately self-similar on any finite domain over which a curve will be defined. Also, note that γ does not in general generate a simple closed curve. In fact, as σ → 0, the number of sinusoidal periods on the interval t ∈ [0, 1] explodes, generating an infinite number of self-intersections.
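As a minimal sketch of Equation 1, the shapelet can be evaluated directly with complex arithmetic. The function names and the 128-point sampling are illustrative assumptions, not part of Dubinskiy & Zhu's implementation.

```python
import cmath
import math

def shapelet(t, sigma, mu):
    """Evaluate the shapelet of Equation 1 at arc length t; returns a complex point."""
    envelope = math.exp(-((t - mu) ** 2) / (2.0 * sigma ** 2))  # Gaussian window
    carrier = cmath.exp(2j * math.pi * (t - mu) / sigma)         # cos(x) + i sin(x)
    return envelope * carrier

def sample_shapelet(sigma, mu, n=128):
    """Sample the shapelet at n evenly spaced arc-length values in [0, 1)."""
    return [shapelet(k / n, sigma, mu) for k in range(n)]

def periods_on_unit_interval(sigma):
    """The carrier has period sigma in t, so [0, 1] holds 1/sigma periods."""
    return 1.0 / sigma
```

The last helper makes the σ → 0 remark concrete: the period count grows as 1/σ, and each additional loop of the trace adds self-intersections.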

Figure 1: An example shapelet. (a) Component functions x(t) and y(t), plotted against t. (b) Generated image trace, y(t) plotted against x(t).

Shifting and scaling shapelets over arclength produces a basis set sufficient to generate arbitrarily complex shapes. In particular, a K-shapelet curve $\Gamma^K(t)$ can be defined as:

\[ \Gamma^K(t) = z_0 + \sum_{k=1}^{K} A_k\,\gamma(t;\sigma_k,\mu_k), \tag{2} \]

where the 2 × 2 matrix $A_k$ applies an affine transformation to each shapelet in image space prior to linear combination.
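The sum in Equation 2 can be sketched as follows, assuming each 2 × 2 matrix acts on a complex point viewed as the real vector (Re z, Im z). The helper names and the tuple layout of the components are hypothetical.

```python
import cmath
import math

def shapelet(t, sigma, mu):
    """Shapelet of Equation 1."""
    envelope = math.exp(-((t - mu) ** 2) / (2.0 * sigma ** 2))
    return envelope * cmath.exp(2j * math.pi * (t - mu) / sigma)

def apply_affine(A, z):
    """Apply a real 2x2 matrix A = ((a, b), (c, d)) to z viewed as (Re z, Im z)."""
    (a, b), (c, d) = A
    return complex(a * z.real + b * z.imag, c * z.real + d * z.imag)

def shapelet_curve(t, z0, components):
    """Equation 2: z0 plus the sum of transformed shapelets.

    components is a list of (A, sigma, mu) tuples, one per shapelet.
    """
    total = z0
    for A, sigma, mu in components:
        total += apply_affine(A, shapelet(t, sigma, mu))
    return total
```

Because the components are simply added, nothing in this construction prevents the resulting curve from crossing itself.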

Dubinskiy & Zhu’s shapelet model has many positive features. Components

are localized, albeit only in arclength, and scale is made explicit in a natural way.

However, like all contour-based methods, the shapelet theory does not explicitly

capture regional properties of shape. Perhaps most crucially, the model does

not respect the topology of object boundaries: sampling from the model will in

general yield non-simple, i.e., self-intersecting, curves (Figure 2). This violates

the closure criterion identiﬁed in Section 1.

Figure 2: Sampling from the shapelet model generally yields non-simple curves.

2.2. Symmetry-Based Models

Blum and colleagues [10, 11] introduced the symmetry axis representation

of shape in which a planar shape is represented by a 1D skeleton function and

associated 1D radius function. The symmetry axis representation led to re-

lated representations [12] which found application in medical imaging and other

domains.

Subsequent work incorporated notions of scale and time with symmetry axis

descriptions. Leyton [13] related symmetry axis descriptions to causal defor-

mation processes acting upon prototype shapes. In this view, symmetry axes,

terminating at curvature extrema on the boundary, are understood as records

of these deformation processes. Subsequent work on curve evolution methods

and shock-graph representations [14, 15] has provided a more complete theory

of region-based shape representation that has been broadly applied.

Despite the many appealing features of symmetry axis and shock-graph rep-

resentations, these methods, in general, are not sparse. In fact, the description of

each shape typically requires more storage, and little emphasis has been placed

on making symmetry axis representations generative [3]. Recent work of Trinh

and Kimia exploring generative and sparse models based upon shock graphs

comes some way in overcoming these limitations [16]. However, the constraints

required to enforce the closure property, i.e., topological constraints, are fairly

complex, and the full potential of the theory has yet to be explored.

A related approach to shape representation (e.g., [17, 18]) employs finite

element modelling techniques to code the bounding contour in terms of the

free vibration modes of the shape, which are said to correspond to the object’s

generalized axes of symmetry. The main diﬃculty in developing this approach

into a generative model is that points on the boundary are coupled only locally

in the intrinsic coordinates of the shape boundary, thus nothing constrains the

topology of generated shapes.

2.3. Hybrid Approaches

Recognizing the merits and limitations of both contour-based and symmetry-

based approaches, Zhu [19] developed an MRF model for natural 2D shape, em-

ploying a neighbourhood structure that can directly encode both contour-based

and region-based Gestalt principles. The theory is promising in many respects.

It is generative, providing an explicit probabilistic model, and it captures both

region and contour properties. It is not sparse, however, and because the un-

derlying graph is lifted from the image plane, there is nothing in the model

that encodes the topological constraint that the boundary be simple, i.e., non-

intersecting. Instead, when sampling from the model, a ‘ﬁrewall’ is employed

to prevent intersections. Again, this is ineﬃcient, and it also creates a dis-

connect between the generative variables encoding the model and the sampling

distribution.

2.4. Coordinate Transformations

A diﬀerent class of model that could also be called region-based involves the

application of coordinate transformations of the planar space in which a shape

is embedded. This idea can be traced back at least to D’Arcy Thompson, who

considered speciﬁc classes of global coordinate transformations of the plane to

model the relationship between the shapes of diﬀerent animal species [20]. In

the ﬁeld of computer vision, Jain et al. [21] were among the ﬁrst to extend this

idea to more general deformations with a complete Fourier deformation basis

that they used to match observed shapes to stored prototypes. However, this

Fourier basis fails to satisfy the locality property, and as a potential genera-

tive model it does not satisfy the closure property: random combinations of

Fourier deformation components will not in general preserve the topology of the

prototype curve.

More recently, Sharon & Mumford [22] have explored conformal mappings as

global coordinate transformations between planar shapes. However, although

the Riemann mapping theorem guarantees that any simple closed curve can

be conformally mapped to the unit circle, conformal mappings do not in gen-

eral preserve the topology of embedded contours. Hence, despite the compu-

tational constraints imposed by the Cauchy-Riemann equations, we again have

the problem that the set of valid bounding contours is not closed under these

transformations, making generative modeling diﬃcult.

2.5. Localized Diﬀeomorphisms: Formlets

In considering prior generative shape models, the goal that seems most elu-

sive is that of closure: ensuring that the model generates only valid shapes. Our

approach originates with the observation that, while general smooth coordinate

transformations of the plane will not preserve the topology of an embedded

curve, it is straightforward to design a speciﬁc family of diﬀeomorphic transfor-

mations that will. It then follows immediately by induction that a generative

model based upon arbitrary sequences of diﬀeomorphisms will satisfy the closure

property.

In this paper we speciﬁcally consider a family of diﬀeomorphisms we call

formlets. A formlet is a simple, isotropic, radial deformation of planar space

that is localized within a speciﬁed circular region of a selected point in the plane.

The family comprises formlets over all locations and spatial scales. While the

gain of the deformation is also a free parameter, it is constrained to satisfy a

simple criterion that guarantees that the formlet is a diﬀeomorphism. Since

topological changes in an embedded ﬁgure can only occur if the deformation

mapping is either discontinuous or non-injective, these diﬀeomorphic deforma-

tions are guaranteed to preserve the topology of embedded ﬁgures. Thus the

model satisﬁes the closure property.

By construction, formlets satisfy the desired locality and scaling proper-

ties. It is straightforward to show that the model also satisﬁes the composition,

completeness, and progression properties in that an arbitrary shape can be ap-

proximated to increasing precision by composing an appropriate sequence of

localized formlets. Since each formlet may be centered either near the contour,

near a symmetry axis, or at any other location in the plane, the model has the

potential to capture both region and contour properties directly.

Our formlet model is closely related to recent work by Grenander et al.

[23], modeling changes to anatomical parts over time. Their representation,

called Growth by Random Iterated Diﬀeomorphisms (GRID), models growth

as a sequence of local and radial deformations. They demonstrate their model

by tracking growth in the rat brain, as revealed in sequential planar sections of

MRI data.

In the present paper we explore the possibility that these ideas could be

extended to model not just diﬀerential growth between sequential shapes, but

to serve as the basis for a generative model over the entire space of smooth

shapes, based upon a universal embryonic shape in the plane such as an ellipse.

Elements of the present paper were ﬁrst reported at CVPR [24]. The main

contributions of this conference paper were:

1. We illustrated the completeness and closure properties of the formlet

model through random generation of sample shapes.

2. To solve the inverse problem of modeling given shapes, we developed and

applied a generalization of matching pursuit, which selects the sequence of

formlets that minimizes approximation error. We demonstrated that this

formlet pursuit algorithm allows for progressive approximation of shape,

while preserving topological properties.

3. We assessed the robustness of the formlet model to occlusion by evaluat-

ing it on the problem of contour completion. We found that the model

compares favourably with the contour-based shapelet model [9] on this

important problem.

In the present paper we elaborate substantially on these contributions, in-

cluding full derivations and complete implementation details. But we also build

on this work with several important new contributions:

1. We introduce a method for handling analytically computed optimal gain

values that exceed the diﬀeomorphism bounds.

2. We develop and evaluate an improved parameter optimization method

called dictionary descent, and show that it increases accuracy by 11% and

decreases run time by 42%, relative to standard dictionary pursuit.

3. We provide derivations for the Jacobian required for this new dictionary

descent method.

4. We develop, evaluate and compare several alternative mathematical for-

mulations of the formlet function.

5. We report statistics of formlet model parameters for our database of ani-

mal shapes, demonstrating coarse-to-ﬁne scaling properties and an inter-

esting anisotropy in the location distribution.

3. Formlet Coding

3.1. Formlet Bases

We represent the image in the complex plane C, and define a formlet f : C → C to be a diffeomorphism of the complex plane localized in scale and space. Such a deformation can be realized by centering f about the point ζ ∈ C and allowing f to deform the plane within a (σ ∈ R+)-region of ζ. Our Gabor-inspired deformation is defined as

\[ f(z;\zeta,\sigma,\alpha) = \zeta + \frac{z-\zeta}{|z-\zeta|}\,\rho(|z-\zeta|;\sigma,\alpha), \quad\text{where}\quad \rho(r;\sigma,\alpha) = r + \alpha\,\sin\!\left(\frac{2\pi r}{\sigma}\right)\exp\!\left(-\frac{r^2}{\sigma^2}\right). \tag{3} \]

Thus each formlet f : C → C is a localized isotropic and radial deformation of the plane at location ζ and scale σ. The magnitude of the deformation is controlled by the gain parameter α ∈ R. Figure 3 demonstrates formlet deformations of the plane with positive and negative gain.
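A minimal sketch of Equation 3 in code may help make the roles of ζ, σ, and α concrete. The function names are illustrative, and the explicit handling of z = ζ (where the direction is undefined but ρ(0) = 0) is an assumption about how the singular point is treated.

```python
import math

def rho(r, sigma, alpha):
    """Radial deformation of Equation 3."""
    return r + alpha * math.sin(2.0 * math.pi * r / sigma) * math.exp(-(r / sigma) ** 2)

def formlet(z, zeta, sigma, alpha):
    """Apply the formlet centred at zeta, with scale sigma and gain alpha, to the point z."""
    d = z - zeta
    r = abs(d)
    if r == 0.0:
        return z  # rho(0) = 0, so the centre is a fixed point
    return zeta + (d / r) * rho(r, sigma, alpha)
```

For example, `formlet(0.125 + 0j, 0j, 0.5, 0.05)` moves the point away from the centre, while the same call with gain −0.05 moves it toward the centre. Points many multiples of σ away are left essentially unchanged by the Gaussian envelope.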

Figure 3: Example formlet deformations: (a) expansion (α > 0); (b) compression (α < 0). The location of the formlet ζ is indicated by the asterisk.

3.2. Diﬀeomorphism Constraint

Without any constraints on the parameters, these deformations, though con-

tinuous, can fold the plane on itself, changing the topology of an embedded con-

tour. In order to preserve topology, we must constrain the gain parameter to

guarantee that each deformation is a diﬀeomorphism. As the formlets deﬁned

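in Equation 3 are both isotropic and angle preserving, it is sufficient to require that the radial deformation ρ be a diffeomorphism of R+, i.e., that ρ(r; σ, α) be strictly increasing in r:

\[ \begin{aligned} & \frac{\partial}{\partial r}\,\rho(r;\sigma,\alpha) > 0 \\ \Rightarrow\;& \alpha\,\frac{\partial}{\partial r}\left[\sin\!\left(\frac{2\pi r}{\sigma}\right)\exp\!\left(-\frac{r^2}{\sigma^2}\right)\right] > -1 \\ \Rightarrow\;& \frac{2\alpha}{\sigma}\exp\!\left(-\frac{r^2}{\sigma^2}\right)\left[\pi\cos\!\left(\frac{2\pi r}{\sigma}\right) - \frac{r}{\sigma}\sin\!\left(\frac{2\pi r}{\sigma}\right)\right] > -1 \end{aligned} \tag{4} \]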

For α < 0, it is easy to see that the minimal slope of ρ is attained as r → 0+. Evaluating Equation 4 at r = 0 thus yields the lower bound on the gain α:

\[ \alpha > -\frac{\sigma}{2\pi}. \tag{5} \]

For positive α, the location of the minimum of ρ′(r) does not have a closed form solution, but the resulting bound can be computed numerically:

\[ \alpha \lesssim 0.1956\,\sigma. \tag{6} \]

Thus the diffeomorphism constraint is:

\[ \alpha \in \sigma\left(-\frac{1}{2\pi},\; 0.1956\right). \tag{7} \]
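The numerical bound can be checked with a simple grid search. The sketch below writes the slope of Equation 4 as ρ′(r) = 1 + (α/σ) h(u) with u = r/σ and scans h for its extrema. The grid range and resolution are assumptions chosen for illustration.

```python
import math

def slope_term(u):
    """h(u) = 2 exp(-u^2) (pi cos(2 pi u) - u sin(2 pi u)), with u = r / sigma."""
    return 2.0 * math.exp(-u * u) * (
        math.pi * math.cos(2.0 * math.pi * u) - u * math.sin(2.0 * math.pi * u)
    )

def gain_bounds(sigma, u_max=5.0, n=200001):
    """Return (lower, upper) gain limits such that rho'(r) > 0 strictly inside them."""
    values = [slope_term(u_max * k / (n - 1)) for k in range(n)]
    h_max, h_min = max(values), min(values)
    # Negative gains are limited by the largest h; positive gains by the most negative h.
    return -sigma / h_max, -sigma / h_min

lower, upper = gain_bounds(1.0)
```

With σ = 1, the lower limit reproduces −1/(2π) from Equation 5, and the upper limit agrees with the constant 0.1956 of Equation 6 to about four decimal places.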

By enforcing this constraint, we guarantee that the formlet f(z; ζ, σ, α) is a diffeomorphism of the plane. Hence, such a formlet acting on a curve embedded in the plane will be a homeomorphism. In particular, let Γ be the continuous mapping

\[ \Gamma : [0,1] \to \mathbb{C}. \tag{8} \]

Recall that Γ is closed if Γ(0) = Γ(1), and simple if the mapping is otherwise injective. Since a formlet f satisfying Equation 7 is bicontinuous, if Γ is simple and closed, the deformed curve

\[ \Gamma^f(t) = f(\Gamma(t)) \tag{9} \]

will also be simple and closed.

Figures 4(a) and (b) show the radial deformation function ρ(r; σ, α) as a function of r for a range of gain α and scale σ values respectively. Figures 4(c) and (d) show the corresponding trace of the formlet deformation of an ellipse in the plane.

Figure 4: Formlet transformations as a function of scale and gain. (a) ρ with gain variation. (b) ρ with scale variation. (c) f with gain variation. (d) f with scale variation. Panels (a) and (b) plot ρ against r; panels (c) and (d) plot y against x. Dashed lines denote invalid formlet parameters outside the diffeomorphism bounds of Equation 7.

3.3. Formlet Composition

The power of formlets is that they can be composed to produce complex

shapes while preserving topology. We deﬁne the forward formlet composition

problem as follows. Given an embryonic shape $\Gamma^0(t)$ and a sequence of K formlets $\{f_1 \ldots f_K\}$ drawn from a formlet dictionary D, determine the resulting deformed shape $\Gamma^K(t)$. The problem is well-posed because the set of simple closed curves is closed under formlet deformation: multiple formlets can be composed to generate complex shape transformations. Thus,

\[ \Gamma^K(t) = (f_K \circ f_{K-1} \circ \cdots \circ f_1)(\Gamma^0(t)). \tag{10} \]

Figure 5 shows an example of forward composition from a circular embryonic shape, where the formlet parameters ζ, σ, and α have been randomly selected. Note that a rich set of complex shapes is generated without leaving the space of valid shapes (simple, closed contours).
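Forward composition (Equation 10) can be sketched as repeated application of formlets to the sampled points of a curve. The uniform sampling ranges in `random_formlets` are assumptions made for illustration; the text above does not state how the random parameters were drawn. Note also that the topological guarantee applies to the continuous curve, so a coarsely sampled polygon is only an approximation of it.

```python
import cmath
import math
import random

TWO_PI = 2.0 * math.pi

def formlet(z, zeta, sigma, alpha):
    """Formlet of Equation 3 applied to a single complex point."""
    d = z - zeta
    r = abs(d)
    if r == 0.0:
        return z
    new_r = r + alpha * math.sin(TWO_PI * r / sigma) * math.exp(-(r / sigma) ** 2)
    return zeta + (d / r) * new_r

def unit_circle(n=128):
    """n-point polygon sampling the unit circle."""
    return [cmath.exp(1j * TWO_PI * k / n) for k in range(n)]

def compose(curve, formlets):
    """Apply formlets in order: the first tuple in the list is f_1 of Equation 10."""
    for zeta, sigma, alpha in formlets:
        curve = [formlet(z, zeta, sigma, alpha) for z in curve]
    return curve

def random_formlets(count, rng, margin=0.99):
    """Draw (zeta, sigma, alpha) uniformly, keeping alpha strictly inside Equation 7."""
    params = []
    for _ in range(count):
        zeta = complex(rng.uniform(-1.5, 1.5), rng.uniform(-1.5, 1.5))
        sigma = rng.uniform(0.1, 0.8)
        alpha = rng.uniform(-margin * sigma / TWO_PI, margin * 0.1956 * sigma)
        params.append((zeta, sigma, alpha))
    return params

shape = compose(unit_circle(), random_formlets(5, random.Random(0)))
```

Swapping the order of two formlets in the list generally gives a different curve, which is why the pursuit algorithm of Section 4 is described as a non-commutative generalization of matching pursuit.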

Figure 5: Shapes generated by random formlet composition over the unit circle. The first two rows show the result of applying 5 successive random formlets. The asterisk and circle indicate formlet location ζ and scale σ, respectively. The bottom row shows some example shapes produced from the composition of many random formlets.

A more difficult but interesting problem is inverse formlet composition: given an observed shape $\Gamma^{\mathrm{obs}}(t)$, determine a sequence of K formlets $\{f_1 \ldots f_K\}$, drawn from a formlet dictionary D, that best approximate $\Gamma^{\mathrm{obs}}(t)$ by minimizing some reconstruction error ξ. Here we measure error as the squared L2 norm of the residual:¹

\[ \xi(\Gamma^{\mathrm{obs}},\Gamma^K) = \left\| \Gamma^{\mathrm{obs}}(t) - \Gamma^K(t) \right\|_2^2 = \int_0^1 \overline{\left(\Gamma^{\mathrm{obs}}(t)-\Gamma^K(t)\right)}\,\left(\Gamma^{\mathrm{obs}}(t)-\Gamma^K(t)\right)\,dt. \tag{11} \]

¹ For notational simplicity, we treat contours as continuous functions of arc length t. In practice, we represent contours as 128-point vectors. All integrals map to summations in a straightforward manner.
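For sampled contours, the integral of Equation 11 becomes an average over corresponding points. The sketch below assumes uniform sampling in t and a fixed 1:1 point correspondence; the function name is illustrative.

```python
def l2_error(target, model):
    """Discrete form of Equation 11: mean squared modulus of the pointwise residual."""
    if len(target) != len(model):
        raise ValueError("contours must have the same number of points")
    return sum(abs(t - m) ** 2 for t, m in zip(target, model)) / len(target)
```

Dividing by the number of points plays the role of dt = 1/n, so the value does not depend on how finely the contour is sampled.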

4. Formlet Pursuit

4.1. Dictionary Method

As a first attempt to estimate the optimal formlet sequence $\{f_1 \ldots f_K\}$, we propose a version of matching pursuit for sparse approximation [25], replacing the linear summation of elements by a non-commutative composition of formlet components. Algorithm 1 shows the flow of the formlet pursuit algorithm.

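Algorithm 1: Formlet Pursuit of $\Gamma^{\mathrm{obs}}$.

    Initialization: define $\Gamma^0$, an affine image $Az + z_0$ of the unit circle, to be a best matching ellipse approximating $\Gamma^{\mathrm{obs}}$

    for k = 1, . . . , K do

        Optimal Formlet: compute maximal error reducing transformation
            $f_k = \operatorname{argmin}_{f \in D}\; \xi(\Gamma^{\mathrm{obs}}, f(\Gamma^{k-1}))$

        Update Approximation: apply optimal formlet
            $\Gamma^k = f_k(\Gamma^{k-1})$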

Initialization. Given an observed target shape $\Gamma^{\mathrm{obs}}$, we initialize the model as a 128-point polygon sampling the unit circle, and form a 1:1 correspondence between the model and target points that remains fixed throughout pursuit. We next apply an affine transformation to the model to generate an embryonic elliptical shape $\Gamma^0$ minimizing the L2 error $\xi(\Gamma^{\mathrm{obs}}, \Gamma^0)$.

Formlet Selection. At iteration k of the formlet pursuit algorithm, we select the formlet $f_k(z;\zeta_k,\sigma_k,\alpha_k)$ that, when applied to the current model $\Gamma^{k-1}$, maximally reduces the approximation error:

\[ f_k = \operatorname*{argmin}_{f \in D}\; \xi(\Gamma^{\mathrm{obs}}, f(\Gamma^{k-1})). \tag{12} \]

This is a diﬃcult non-convex optimization problem, and experimentation

with gradient descent methods has shown that the formlet parameter space can

have many local minima. One saving grace is that the formlet transformation is linear with respect to the gain α, allowing α to be recovered analytically. Specifically, consider an alternative but equivalent representation of the formlet described by Equation 3:

\[ f(z;\zeta,\sigma,\alpha) = z + \alpha\, g(z-\zeta;\sigma), \quad\text{where}\quad g(z_\zeta;\sigma) = \frac{z_\zeta}{|z_\zeta|}\,\sin\!\left(\frac{2\pi|z_\zeta|}{\sigma}\right)\exp\!\left(-\frac{|z_\zeta|^2}{\sigma^2}\right). \tag{13} \]

In Appendix A we show that, if we fix both the formlet location ζ and scale σ, the optimal unconstrained gain α* for formlet $f_k$ is given by

\[ \alpha^* = \frac{\left\langle \Gamma^{\mathrm{obs}}-\Gamma^{k-1},\; g(\Gamma^{k-1}-\zeta;\sigma)\right\rangle}{\left\| g(\Gamma^{k-1}-\zeta;\sigma)\right\|_2^2}, \tag{14} \]

where ⟨·,·⟩ is the inner product on functions f : [0, 1] → C given in Equation A.3.

One complication is that Equation 14 may yield a gain value α* that does not satisfy the diffeomorphism constraint given by Equation 7. However, from Equations 3 and 11 it can be seen that the error is a quadratic function of the gain α. Thus the optimal constrained gain $\alpha^*_c$ for given ζ and σ parameters is simply the optimal unconstrained gain α* expressed by Equation 14, thresholded by the diffeomorphism constraints:

\[ \alpha^*_c = \begin{cases} \alpha_l & \text{for } \alpha^* < \alpha_l \\ \alpha^* & \text{for } \alpha_l \le \alpha^* \le \alpha_u \\ \alpha_u & \text{for } \alpha^* > \alpha_u, \end{cases} \tag{15} \]

where

\[ \alpha_l = -(2\pi)^{-1}\sigma \quad\text{and}\quad \alpha_u \approx 0.1956\,\sigma. \]
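Equations 13 to 15 can be sketched in a few lines. Since Equation A.3 is not reproduced in this section, the sketch assumes that the inner product reduces to the real part of the Hermitian inner product, which is what least-squares minimization over a real gain requires. The function names are illustrative.

```python
import cmath
import math

TWO_PI = 2.0 * math.pi

def g(z, zeta, sigma):
    """Unit-gain displacement field of Equation 13 at the point z."""
    d = z - zeta
    r = abs(d)
    if r == 0.0:
        return 0j
    return (d / r) * math.sin(TWO_PI * r / sigma) * math.exp(-(r / sigma) ** 2)

def constrained_gain(target, model, zeta, sigma):
    """Least-squares gain (Equation 14) clipped to the bounds of Equation 15."""
    basis = [g(z, zeta, sigma) for z in model]
    # Assumed inner product: real part of the sum of residual times conj(basis).
    num = sum(((t - m) * b.conjugate()).real for t, m, b in zip(target, model, basis))
    den = sum(abs(b) ** 2 for b in basis)
    alpha = num / den if den > 0.0 else 0.0
    lower, upper = -sigma / TWO_PI, 0.1956 * sigma
    return min(max(alpha, lower), upper)
```

Clipping gives the constrained optimum because a quadratic in α is monotonic on either side of its minimizer, so the nearest admissible gain is also the best admissible gain.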

Thus search for the optimal formlet can proceed by sampling from a dictionary over location ζ and scale σ parameters, computing the optimal constrained gain $\alpha^*_c$ in each case, and then selecting the resulting formlet that yields minimum error.
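The greedy dictionary search can be sketched as a double loop over locations and scales. This is an illustrative reading of Algorithm 1 that omits the ellipse initialization and starts from whatever model is supplied; the function names and the way the dictionary is passed in are assumptions.

```python
import cmath
import math

TWO_PI = 2.0 * math.pi

def g(z, zeta, sigma):
    """Unit-gain displacement field of Equation 13."""
    d = z - zeta
    r = abs(d)
    if r == 0.0:
        return 0j
    return (d / r) * math.sin(TWO_PI * r / sigma) * math.exp(-(r / sigma) ** 2)

def error(target, model):
    """Discrete form of Equation 11."""
    return sum(abs(t - m) ** 2 for t, m in zip(target, model)) / len(target)

def best_gain(target, model, zeta, sigma):
    """Clipped optimal gain (Equations 14 and 15) and the sampled field g."""
    basis = [g(z, zeta, sigma) for z in model]
    num = sum(((t - m) * b.conjugate()).real for t, m, b in zip(target, model, basis))
    den = sum(abs(b) ** 2 for b in basis)
    alpha = num / den if den > 0.0 else 0.0
    return min(max(alpha, -sigma / TWO_PI), 0.1956 * sigma), basis

def formlet_pursuit(target, model, centers, scales, K):
    """Greedily pick K formlets; returns (parameters, final model, error history)."""
    history = [error(target, model)]
    chosen = []
    for _ in range(K):
        best = None
        for zeta in centers:
            for sigma in scales:
                alpha, basis = best_gain(target, model, zeta, sigma)
                candidate = [z + alpha * b for z, b in zip(model, basis)]
                e = error(target, candidate)
                if best is None or e < best[0]:
                    best = (e, (zeta, sigma, alpha), candidate)
        e, params, model = best
        chosen.append(params)
        history.append(e)
    return chosen, model, history
```

Because a gain of zero always satisfies the constraint, no iteration can increase the error, which gives the progressive refinement described in Section 1.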

Figure 6 shows an example of formlet pursuit with this dictionary on an

example animal shape.

Figure 6: Formlet pursuit of an example animal shape. We first show the initial unit circle, followed by the least-squares ellipse embryo $\Gamma^0(t)$, and the models $\Gamma^k$, where k = 1, 2, 3, 4, 8, 16, 32. The last curve shows the model $\Gamma^{32}$ without the target curve $\Gamma^{\mathrm{obs}}$.

4.2. Dictionary Descent Method

While the formlet pursuit method has the advantage of simplicity, it is far

from optimal, as it ignores most smoothness properties that the error function

might enjoy, aside from the quadratic dependence upon the gain α. As a con-

sequence one must face the tradeoﬀ between accuracy, which requires that the

parameter space be sampled ﬁnely, and speed, which limits the capacity of the

dictionary.

We can potentially improve upon the standard dictionary method by employing a smaller dictionary, and initiating a local gradient descent search from the m most promising formlets to determine the formlet parameters that locally minimize the error function.
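The seed-then-refine idea can be illustrated as below. To keep the sketch dependency-free, a derivative-free pattern search stands in for the gradient-based optimizer with analytic Jacobian used in the paper; all function names, step sizes, and the evaluation cap are assumptions.

```python
import cmath
import math

TWO_PI = 2.0 * math.pi

def g(z, zeta, sigma):
    """Unit-gain displacement field of Equation 13."""
    d = z - zeta
    r = abs(d)
    if r == 0.0:
        return 0j
    return (d / r) * math.sin(TWO_PI * r / sigma) * math.exp(-(r / sigma) ** 2)

def fit_error(target, model, zeta, sigma):
    """L2 error after applying the formlet at (zeta, sigma) with its clipped optimal gain."""
    basis = [g(z, zeta, sigma) for z in model]
    num = sum(((t - m) * b.conjugate()).real for t, m, b in zip(target, model, basis))
    den = sum(abs(b) ** 2 for b in basis)
    alpha = num / den if den > 0.0 else 0.0
    alpha = min(max(alpha, -sigma / TWO_PI), 0.1956 * sigma)
    return sum(abs(t - m - alpha * b) ** 2
               for t, m, b in zip(target, model, basis)) / len(target)

def refine(target, model, zeta, sigma, step=0.1, min_step=1e-4, max_evals=600):
    """Pattern search over (Re zeta, Im zeta, sigma); accepts only improvements."""
    best = fit_error(target, model, zeta, sigma)
    evals = 0
    while step > min_step and evals < max_evals:
        improved = False
        for dz, ds in ((step, 0.0), (-step, 0.0), (1j * step, 0.0),
                       (-1j * step, 0.0), (0.0, step), (0.0, -step)):
            if sigma + ds <= 0.0:
                continue  # the scale must stay positive
            e = fit_error(target, model, zeta + dz, sigma + ds)
            evals += 1
            if e < best:
                best, zeta, sigma, improved = e, zeta + dz, sigma + ds, True
        if not improved:
            step *= 0.5
    return best, zeta, sigma

def dictionary_descent_step(target, model, centers, scales, m):
    """Score a coarse dictionary, then refine the m lowest-error seeds."""
    scored = sorted(((fit_error(target, model, c, s), c, s)
                     for c in centers for s in scales), key=lambda item: item[0])
    refined = [refine(target, model, c, s) for _, c, s in scored[:m]]
    return scored[0], min(refined, key=lambda item: item[0])
```

Since the best coarse entry is among the seeds and refinement only accepts improvements, the refined error can never exceed the coarse dictionary error for the same step.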

Figure 7 compares pursuit for the standard dictionary and dictionary descent

methods on a particular example animal shape: the higher accuracy of the

dictionary descent method is evident. Table 1 shows the performance of the two

methods on the entire shape dataset. The dictionary descent method reduces

L2 error by roughly 11% and run time by about 42% relative to standard pursuit. We

use the dictionary descent method in our evaluation below. An implementation

is available at www.elderlab.yorku.ca/formlets.

Figure 7: Pursuit of an example animal shape with standard dictionary search (top row) and

dictionary descent (bottom row) for K=1,2,4,8,16.

Table 1: Comparison of Dictionary and Dictionary Descent methods on entire animal dataset.

Optimization Method    L2 Error    Run Time (min)
Dictionary             0.00535     1.9
Dictionary Descent     0.00476     1.1

5. Implementation Details

5.1. Shape Dataset

To explore the inverse problem of constructing formlet representations of

planar shapes, we employ a database consisting of 391 blue-screened images of

animal models from the Hemera Photo-Object database. The boundary of each

object was sampled at 128 points at regular arc-length intervals. Each resulting

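polygon was then shifted to have 0 mean and scaled to have unit L2 norm in both vertical and horizontal directions:

\[ \int_0^1 \operatorname{Re}\!\left(\Gamma^{\mathrm{obs}}(t)\right)^2 dt = \int_0^1 \operatorname{Im}\!\left(\Gamma^{\mathrm{obs}}(t)\right)^2 dt = 1. \tag{16} \]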

This scaling generally alters the aspect ratio of the shape: we invert this distor-

tion when displaying our results. The full dataset of object shapes used in this

paper is available at www.elderlab.yorku.ca/formlets.
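The normalization of Equation 16 can be sketched for a sampled contour as follows. The function name is illustrative, and the sketch assumes the contour has nonzero extent in both directions. The two scale factors are returned so that the aspect-ratio distortion can be inverted for display.

```python
import math

def normalize(points):
    """Shift a contour (list of complex points) to zero mean and scale each axis
    to unit mean square, the discrete counterpart of Equation 16."""
    n = len(points)
    mean = sum(points) / n
    centered = [z - mean for z in points]
    sx = math.sqrt(sum(z.real ** 2 for z in centered) / n)  # horizontal scale
    sy = math.sqrt(sum(z.imag ** 2 for z in centered) / n)  # vertical scale
    normalized = [complex(z.real / sx, z.imag / sy) for z in centered]
    return normalized, (sx, sy)
```

Multiplying the real and imaginary parts of a fitted model by `sx` and `sy` respectively restores the original aspect ratio.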

5.2. Dictionary Method: Discretization

To evaluate this formlet pursuit algorithm, we constructed a dictionary consisting of a regular sampling of the position parameter ζ on a 64 × 64 grid roughly 4 times the extent of the average shape, and the scale parameter σ at 16 regularly-spaced values over (0, 0.8].

5.2.1. Tuning the Dictionary Descent Method

Since our objective function is the L2 norm of the residual error between the observed curve and the approximation, we employed the MATLAB function lsqnonlin(), which is optimized for non-linear least squares problems, and computed the Jacobian of the objective function analytically (Appendix B). We tuned the parameters of our Dictionary Descent method in stepwise fashion. First, we determined appropriate values for the tolerance parameters xTol and fTol of lsqnonlin(), which determine the stopping criteria for the parameters and error function, respectively. We employed a sparse dictionary, sampling the position parameter ζ on a 16 × 16 grid, and the scale parameter σ at 4 regularly-spaced values over (0, 0.8]. We initiated descent at the m = 100 lowest error solutions. Using a small subset of our animal dataset containing only four animal shapes, we performed a grid search in log space over the xTol and fTol parameters in the range 10^−1 to 10^−9, computing the average running time and L2 error for a 32-formlet approximation. All experiments were conducted on a Power Mac G5 with a 2.66 GHz quad-core Intel Xeon processor, running MATLAB R2009b.

The results are shown in Table 2. Error was found to be minimized for parameter values of xTol = 10^−3, fTol = 10^−6: we used these values for all further experiments.

Second, we optimized the density of the dictionary and number m of dictionary formlets selected for descent, using the descent parameters optimized above, the same 4 training shapes, and 32-formlet approximation. The running time and accuracy results are shown in Tables 3 and 4 respectively. Sampling ζ on a 51 × 51 grid, the scale parameter σ at 13 values, and launching m = 25 descents from the most promising formlets, we found that for these four training images we could improve the accuracy over the standard dictionary method by a factor of more than two, while saving roughly 30% in computation time.

Table 2: Average L2 error (×100) for a 32-formlet model, as a function of the gradient descent termination criteria. Rows are fTol values; columns are xTol values.

fTol \ xTol   1E-01   1E-02   1E-03   1E-04   1E-05   1E-06   1E-07   1E-08   1E-09
1E-01         6.86    6.31    6.54    6.54    6.54    6.54    6.54    6.54    6.54
1E-02         5.68    5.26    5.11    5.02    5.02    5.02    5.02    5.02    5.02
1E-03         5.91    4.66    4.04    4.16    4.16    4.16    4.16    4.16    4.16
1E-04         5.80    3.65    4.02    3.72    3.72    3.72    3.72    3.72    3.72
1E-05         5.75    3.69    3.66    3.73    3.90    3.90    3.90    3.90    3.90
1E-06         5.75    3.69    3.52    3.73    3.74    3.74    3.74    3.74    3.74
1E-07         5.75    3.69    3.54    3.71    3.66    3.66    3.66    3.66    3.66
1E-08         5.75    3.69    3.54    3.67    3.66    3.66    3.66    3.66    3.66
1E-09         5.75    3.69    3.54    3.67    3.66    3.66    3.66    3.66    3.66

Table 3: Average running time per shape (min) for a 32-formlet model, as a function of dictionary size n and number of descents m. Rows are m values; columns are n values.

m \ n   64²×16   58²×14   51²×13   45²×11   38²×10   32²×8   26²×6   19²×5
0       1.28     0.93     0.68     0.45     0.29     0.16    0.09    0.04
1       1.38     1.01     0.74     0.50     0.33     0.21    0.11    0.07
5       1.44     1.06     0.78     0.55     0.38     0.24    0.18    0.13
10      1.49     1.10     0.84     0.60     0.44     0.31    0.25    0.20
15      1.56     1.17     0.90     0.67     0.50     0.37    0.32    0.28
20      1.60     1.23     0.95     0.73     0.56     0.42    0.41    0.35
25      1.67     1.28     1.01     0.78     0.62     0.48    0.47    0.42
30      1.72     1.34     1.07     0.84     0.68     0.55    0.53    0.51

Table 4: Average residual (×1000) for a 32-formlet model, as a function of dictionary size n (columns) and number of descents m (rows).

m \ n   64²×16   58²×14   51²×13   45²×11   38²×10   32²×8   26²×6   19²×5
0          8.0      6.5      8.1      9.5      9.3    14.0    23.5    28.1
1          3.7      4.7      4.8      5.8      5.0     5.9     7.4    14.0
5          3.6      3.8      4.7      5.9      4.2     6.4     7.4     7.1
10         4.1      3.7      3.8       45      4.5     6.1     6.9     6.4
15         3.6      3.9      4.1      4.0      4.1     5.7     6.7     8.0
20         3.4      3.8      3.8      4.5      4.1     4.8     6.6     8.0
25         3.7      3.8      3.7      4.5      4.2     4.8     6.5     7.4
30         3.4      3.9      3.7      4.4      4.2     4.4     5.7     7.5


Interestingly, we found that tightening tolerance parameters, increasing the

dictionary density, or increasing the number of deployments of the optimizer did

not always decrease the error. However, at a given iteration, error did decrease

monotonically as a function of each of these parameters, as expected. Thus the

non-monotonic variation in error with these parameters appears to reﬂect the

non-optimality of the greedy pursuit algorithm. In other words, selecting the

formlet that minimizes the residual at stage i will not necessarily lead to the smallest error at stage k > i.

6. Evaluation

To evaluate and compare shape models, we address the problem of contour

completion, using our animal shape dataset. In natural scenes, object bound-

aries are often fragmented by occlusion and loss of contrast: contour completion

is the process of ﬁlling in the missing parts. Completion can also be an impor-

tant component of perceptual organization algorithms: given one or more partial

contour hypotheses, completion can be used to estimate the locations of missing

parts. These estimates can then guide the search for corroborating evidence.

We compare our formlet model with the shapelet model described in Section

2.1 [9]. For each shape in the dataset, we simulate the occlusion of a 10% or

30% continuous section of the contour, and allow the two methods to pursue

only the remaining visible portion.

The rate of convergence of both formlet and shapelet methods depends upon

how the parameters are sampled. For formlet pursuit, we use the dictionary de-

scent method described in Section 4.2. For the shapelet method, we used the

standard dictionary method of Dubinskiy et al. [9], optimizing performance by

sampling as finely as possible given time constraints. The shapelet representation assumes an arc-length representation of the curves on t ∈ [0, 1], and each shapelet component has an arc-length position µ and scale σ. We sampled the position parameter µ at 128 regularly-spaced values over [0, 1], and the scale parameter σ at 128 regularly-spaced values over (0, 1]. The affine parameters were computed analytically [9].

The formlet and shapelet pursuit algorithms were initialized with the same embryonic ellipses, and were governed by a minimization of the L2 error (Equation 11) over the visible points of the curves only. While pursuit is based on a fixed 1:1 correspondence between points on the target and model curves, we measure performance using the L2 Hausdorff distance to avoid potential dependence of the evaluation upon the parameterization of the curves. Specifically, we define the error between the target shape and the model as the root-mean-square minimum distance from a point on one of the shapes to the other shape, averaged over both directions:

$$\xi_H(\Gamma^{obs}, \Gamma^{k}) = \sqrt{\int_0^1 \frac{1}{2}\left( \min_{t' \in [0,1)} \left|\Gamma^{obs}(t) - \Gamma^{k}(t')\right|^2 + \min_{t' \in [0,1)} \left|\Gamma^{obs}(t') - \Gamma^{k}(t)\right|^2 \right) dt} \tag{17}$$
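As an illustration of Equation 17, the following minimal sketch evaluates a discrete analogue of the L2 Hausdorff error for two curves sampled as lists of complex points. The function names (`l2_hausdorff`, `circle`) and the uniform sampling in t are assumptions made for this example and are not taken from the released implementation.

```python
import math

def l2_hausdorff(curve_a, curve_b):
    """Discrete analogue of Equation 17 for curves given as lists of complex points."""
    def mean_sq_min_dist(src, dst):
        # For each sample on src, squared distance to the nearest sample on dst,
        # averaged over src (approximates the integral over t in [0, 1)).
        return sum(min(abs(p - q) ** 2 for q in dst) for p in src) / len(src)

    # Average the two directions (the factor 1/2), then take the square root.
    return math.sqrt(0.5 * (mean_sq_min_dist(curve_a, curve_b)
                            + mean_sq_min_dist(curve_b, curve_a)))

def circle(radius, n=200, shift=0):
    """Hypothetical test curve: n samples of a circle, starting at sample index `shift`."""
    return [radius * complex(math.cos(2 * math.pi * (i + shift) / n),
                             math.sin(2 * math.pi * (i + shift) / n))
            for i in range(n)]
```

Because each point is matched to its nearest neighbour on the other curve, shifting the starting point of one curve leaves the error essentially unchanged, which is the parameterization independence the measure is intended to provide.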


We measured the residual error between the model and target for both the

visible and occluded portions of the shapes. Performance on the occluded por-

tion, where the model is under-constrained by the data, reveals how well the

structure of the model captures properties of natural shapes.

Implementations for both the formlet and shapelet models are available at

www.elderlab.yorku.ca/formlets.

6.1. Results

Figure 8 shows some example qualitative results for this experiment. While

shapelet pursuit introduces topological errors in both visible and occluded re-

gions, formlet pursuit remains topologically valid, as predicted.

Figure 8: Examples of 30% occlusion pursuit with shapelets (red) and formlets (blue) for k = 0, 2, 4, 8, 16, 32. Solid lines indicate visible contour, dashed lines indicate occluded contour.

Figure 9 shows quantitative results for this experiment. While the shapelet

and formlet models achieve comparable error on the visible portions of the

boundaries, the error is substantially lower for the formlet representation on

the occluded portions. This suggests that the structure of the formlet model

better captures regularities in the shapes of natural objects. We believe that the

two principal reasons for this are a) respecting the topology of the shape prunes

oﬀ many inferior completion solutions and b) by working in the image space,

rather than arc length, the formlet model is better able to capture important

regional properties of shape.

Figure 9: Results of occlusion pursuit evaluation. Panels plot normalized RMS error against number of components for 10% occlusion and 30% occlusion; curves show formlet and shapelet error on the visible and occluded portions. Black denotes error for Γ⁰(t), the affine-fit ellipse.

7. Discussion

7.1. Formlet Parameter Distributions

The focus of this paper is to establish the appropriate structural properties

for a generative model of planar shape. To ultimately apply this representation

to problems such as object detection and recognition, statistical models over this

representation must be developed. One small step is to consider the distribution

of formlet parameters selected in pursuit of the shapes in our animal dataset.

Figure 10 shows how the means of the formlet parameters vary as pursuit

unfolds. We observe that scales decrease over time (a), reﬂecting a coarse-to-ﬁne

approximation. Gains also decrease over time (b), although when normalized

by scale (c), this decline is moderated substantially. Finally, formlet locations

are biased to the centre of the shape and are roughly isotropic (d), with a slight

bias to the lower ﬁeld, presumably reﬂecting the additional details required to

represent the legs of the animals.

7.2. Alternative Formlet Bases

In this paper we have chosen a particular Gabor-like formlet representation

(Equation 3) that confers several key properties:

1. The family of formlets forms a self-similar scale space.

2. Each formlet acts within a σ-ball around a specific location ζ, converging to the identity as |z − ζ| → ∞.

3. The mapping is smooth everywhere except at ζ, where it is C⁰.

4. Deformation is isotropic and radial around ζ.

There are of course other formulations that would also satisfy these prop-

erties. Here we consider two speciﬁc alternatives and compare them with the

Gabor formulation.

Figure 10: Marginal distributions of formlet parameters. (a) Mean scale at each iteration (mean scale σ versus formlet number k). (b) Mean gain at each iteration (expansive formlets) (mean absolute gain |α| versus formlet number k). (c) Mean gain at each iteration (compressive formlets) (mean absolute gain:scale ratio |α|/σ versus formlet number k). (d) Location histogram over Re(ζ) and Im(ζ). Error bars indicate standard error of the mean.

7.3. Gaussian Derivative Formlets

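We simplify the original Gabor formulation of Equation 3 by replacing the sinusoidal factor with a first-order Taylor series approximation, yielding:

$$f(z; \zeta, \sigma, \alpha) = \zeta + \frac{z - \zeta}{|z - \zeta|}\,\rho(|z - \zeta|), \quad \text{where} \tag{18}$$

$$\rho(r) = r + \alpha\,\frac{2\pi r}{\sigma}\exp\left(-\frac{r^2}{\sigma^2}\right) \tag{19}$$

Note that the deformation term of the radial deformation function $\rho(r)$ is proportional to the first Gaussian derivative in $r$.

$f$ is a diffeomorphism iff $\rho'(r) > 0$ everywhere:

$$\rho'(r) = 1 + \exp\left(-\frac{r^2}{\sigma^2}\right)\frac{2\pi\alpha}{\sigma}\left(1 - \frac{2r^2}{\sigma^2}\right) > 0. \tag{20}$$

For $\alpha < 0$, the minimum is attained when $r = 0$:

$$\Rightarrow \alpha > -\frac{1}{2\pi}\,\sigma \tag{21}$$

For $\alpha > 0$, by solving $\rho''(r) = 0$ it can be shown that the minimum is attained when $r = \sqrt{3/2}\,\sigma$. Substituting into Equation 20 then yields

$$\alpha < \frac{\exp(3/2)}{4\pi}\,\sigma \tag{22}$$

Thus $f$ is a diffeomorphism iff $\alpha \in \frac{\sigma}{2\pi}\left(-1, \tfrac{1}{2}\exp(3/2)\right)$.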
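The bounds of Equations 21 and 22 can be checked numerically. The sketch below, with illustrative function names that are not part of the original implementation, evaluates ρ′(r) from Equation 20 on a dense grid and confirms that its minimum stays positive just inside the gain interval and turns negative just outside it. The grid range of 6σ is an assumption chosen to cover the region where the Gaussian factor is non-negligible.

```python
import math

def rho_prime(r, sigma, alpha):
    """Derivative of the Gaussian-derivative radial function (Equation 20)."""
    u = (r / sigma) ** 2
    return 1.0 + math.exp(-u) * (2.0 * math.pi * alpha / sigma) * (1.0 - 2.0 * u)

def gain_bounds(sigma):
    """Open interval of gains from Equations 21 and 22."""
    lower = -sigma / (2.0 * math.pi)
    upper = sigma * math.exp(1.5) / (4.0 * math.pi)
    return lower, upper

def min_rho_prime(sigma, alpha, n=20000, r_max_factor=6.0):
    # Brute-force minimum of rho' over a dense grid on [0, 6 * sigma].
    step = r_max_factor * sigma / n
    return min(rho_prime(i * step, sigma, alpha) for i in range(n + 1))
```

For any σ > 0, `min_rho_prime` should be positive for gains slightly inside `gain_bounds(sigma)` and negative for gains slightly outside.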

7.4. Spline Formlets

Both the Gabor and Gaussian formlets have infinite support, which increases computation time and limits the degree to which formlets can be computed in parallel. To achieve strictly compact support we impose the constraint that $\rho(r;\sigma) = r \iff f(z) = z$ whenever $r > \sigma$. To guarantee smoothness, we require $\rho(\sigma;\sigma) = \sigma$ and $\rho'(\sigma;\sigma) = 1$, and to achieve continuity at $\zeta$ we require $\rho(0) = 0$. The simplest spline meeting all these conditions is:

$$\rho(r;\sigma) = \begin{cases} r + \alpha\,\dfrac{r}{\sigma^2}(r - \sigma)^2 & \text{for } r \le \sigma \\ r & \text{for } r > \sigma \end{cases} \tag{23}$$

We derive the diffeomorphism constraints as before:

$$\rho'(r) = 1 + \frac{\alpha}{\sigma^2}\left[(r - \sigma)^2 + r \cdot 2(r - \sigma)\right] > 0 \tag{24}$$

$$\Rightarrow \frac{\alpha}{\sigma^2}\left(3r^2 - 4r\sigma + \sigma^2\right) > -1 \tag{25}$$

For $\alpha < 0$, the minimum is attained when $r = 0$, yielding $\alpha > -1$.

For $\alpha > 0$, by solving $\rho''(r) = 0$ it can be shown that the minimum is attained when $r = 2\sigma/3$. Substituting into Equation 24 then yields $\alpha < 3$.

Thus $f$ is a diffeomorphism iff $\alpha \in (-1, 3)$.
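A short sketch of the spline radial function of Equation 23 is given here, together with a brute-force monotonicity test that mirrors the constraint ρ′(r) > 0. The helper names and the grid extent (1.5σ) are assumptions for illustration only.

```python
def rho_spline(r, sigma, alpha):
    """Radial deformation of Equation 23: cubic inside the sigma-ball, identity outside."""
    if r > sigma:
        return r
    return r + alpha * r * (r - sigma) ** 2 / sigma ** 2

def is_strictly_increasing(sigma, alpha, n=10000):
    # rho must be strictly increasing in r for the formlet to be a diffeomorphism.
    step = 1.5 * sigma / n
    values = [rho_spline(i * step, sigma, alpha) for i in range(n + 1)]
    return all(b > a for a, b in zip(values, values[1:]))
```

Gains inside (−1, 3) should pass the monotonicity test and gains outside it should fail, independent of σ, since the spline gain is dimensionless in this formulation.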

7.5. Comparison of Formlet Bases

Figures 11 - 12(b) show the radial deformation functions, examples of pursuit

and rate of convergence for these three diﬀerent formulations. Empirically, we

ﬁnd that the Gabor formulation achieves a better rate of convergence on the

animal dataset than the competing formulations, although at this stage we do

not have a clear theoretical explanation for this result.

Figure 11: Radial deformation function ρ(r) for three formlet bases (Gabor, Gaussian and spline).

8. Conclusion

We have developed a novel generative model of planar shape that satisﬁes

a number of essential properties. In this model, complex shapes are seen as

the evolution of a simple embryonic shape by successive application of simple

diﬀeomorphic transformations of the plane called formlets. The system is both

complete and closed, since arbitrary shapes can be modeled, and generated

shapes are guaranteed to be topologically valid. This means that the model has

the potential to support accurate probabilistic modeling. We have demonstrated

a novel dictionary descent formlet pursuit algorithm that selects formlets to ef-

ﬁciently approximate given target shapes. Evaluation of the formlet pursuit

model on the problem of shape completion revealed that the model is better

able to approximate parts of shapes missing due to occlusion than a competing

contour-based method. Our animal object dataset, experimental results, exam-

ple movies and implementations for both the formlet and shapelet models are

available at www.elderlab.yorku.ca/formlets.

Future Work. We hope to extend the present work in a number of ways. First,

we would like to generalize our deﬁnition of formlets to allow for anisotropic

deformation that could eﬃciently model elongated parts such as animal limbs.

Second, we would like to develop probabilistic models over the formlet repre-

sentation. Finally, we are interested in using the formlet pursuit algorithm for

contour grouping, using detected fragments to generate predictions for where

other fragments of the same object boundary might be found.

Figure 12: Comparison of the three different formlet bases. (a) Pursuit of an example shape with Gabor (blue), Gaussian (green) and spline (red) bases for K = 1, 2, 4, 8, 16. (b) Mean L2 Hausdorff error (normalized RMS error versus number of components) for formlet pursuit over the animal dataset with three different formlet bases.


References

[1] S. Belongie, J. Malik, J. Puzicha, Shape matching and object recognition

using shape contexts, Pattern Analysis and Machine Intelligence, IEEE

Trans. 24 (2002) 509–522.

[2] P. Cavanagh, What’s up in top-down processing, in: A. Gorea (Ed.),

Representations of Vision, Cambridge University Press, Cambridge, UK,

1991 edition, 1991, pp. 295–304.

[3] D. Mumford, Mathematical theories of shape: do they model perception?,

in: B. C. Vemuri (Ed.), Geometric Methods in Computer Vision, volume

1570, SPIE, 1991, pp. 2–10.

[4] F. Attneave, Some informational aspects of visual perception, Psychol.

Rev. 61 (1954) 183–193.

[5] T. F. Cootes, C. J. Taylor, D. H. Cooper, J. Graham, Active shape models

- their training and application, Comput. Vis. Image Underst. 61 (1995)

38–59.

[6] T. Pavlidis, Structural pattern recognition, volume 1, Springer-Verlag,

Berlin, illustrated edition, 1977.

[7] D. D. Hoﬀman, W. A. Richards, Parts of recognition, Cognition 18 (1984)

65–96.

[8] F. Mokhtarian, A. Mackworth, Scale-based description and recognition of

planar curves and two-dimensional shapes, Pattern Analysis and Machine

Intelligence, IEEE Trans. 8 (1986) 34–43.

[9] A. Dubinskiy, S. Zhu, A multiscale generative model for animate shapes

and parts, in: Proc. 9th IEEE ICCV, volume 1, pp. 249–256.

[10] H. Blum, Biological shape and visual science (part i), J. Theoretical Biology

38 (1973) 205–287.

[11] H. Blum, R. N. Nagel, Shape description using weighted symmetric axis

features, Pattern Recognition 10 (1978) 167 – 180.

[12] M. Brady, H. Asada, Smoothed local symmetries and their implementation,

Int. J. Robotics Res. 3 (1984) 36–61.

[13] M. Leyton, A process-grammar for shape, Artiﬁcial Intelligence 34 (1988)

213 – 247.

[14] B. B. Kimia, A. R. Tannenbaum, S. W. Zucker, Shapes shocks and de-

formations i: the components of two dimensional shape and the reaction

diﬀusion space, Int. J. Comput. Vision 15 (1995) 189–224.


[15] S. Osher, J. A. Sethian, Fronts propagating with curvature-dependent

speed, J. Comput. Phys. 79 (1988) 12–49.

[16] N. Trinh, B. Kimia, A symmetry-based generative model for shape, in:

Proc. 11th IEEE ICCV, pp. 1–8.

[17] A. Pentland, S. Sclaroﬀ, Closed-form solutions for physically based shape

modeling and recognition, Pattern Analysis and Machine Intelligence,

IEEE Transactions on 13 (1991) 715–729.

[18] S. Sclaroff, A. Pentland, Modal matching for correspondence and recognition, Pattern Analysis and Machine Intelligence, IEEE Trans. 17 (1995) 545–561.

[19] S.-C. Zhu, Embedding gestalt laws in markov random ﬁelds, Pattern

Analysis and Machine Intelligence, IEEE Trans. 21 (1999) 1170–1187.

[20] D. W. Thompson, On growth and form, Cambridge University Press, Cam-

bridge, UK, abridged ed./edited edition, 1961.

[21] A. Jain, Y. Zhong, S. Lakshmanan, Object matching using deformable

templates, Pattern Analysis and Machine Intelligence, IEEE Trans. 18

(1996) 267–278.

[22] E. Sharon, D. Mumford, 2d-shape analysis using conformal mapping, Com-

puter Vision and Pattern Recognition, IEEE Comp. Soc. Conf. 2 (2004)

350–357.

[23] U. Grenander, A. Srivastava, S. Saini, A pattern-theoretic characterization

of biological growth, Medical Imaging, IEEE Trans. 26 (2007) 648–659.

[24] T. Oleskiw, J. Elder, G. Peyré, On growth and formlets, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2010).

[25] S. Mallat, Z. Zhang, Matching pursuits with time frequency dictionaries,

Signal Processing, IEEE Trans. 41 (1993) 3397–3415.

Appendix A. Computation of Optimal Gain

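Since the formlet deformation of Equation 3 is linear in the gain $\alpha$, given fixed location $\zeta$ and scale $\sigma$ parameters, the gain that minimizes the $L_2$ deviation from the target shape can be computed analytically. Specifically, suppose that the observed curve $\Gamma^{obs}$ is currently approximated by $\Gamma^{k-1}$. For given formlet location and scale parameters $\zeta$ and $\sigma$, we define the optimal unconstrained gain $\alpha^*$ for formlet $f_k$ as:

$$\alpha^* = \operatorname*{argmin}_{\alpha \in \mathbb{R}} \; \xi\left(\Gamma^{obs}, f(\Gamma^{k-1}; \zeta, \sigma, \alpha)\right) \tag{A.1}$$

where, for curves $a$ and $b$, $\xi(\Gamma^a, \Gamma^b)$ denotes the $L_2$ error metric

$$\int_0^1 \left[\operatorname{Re}\left(\Gamma^a(t) - \Gamma^b(t)\right)\right]^2 + \left[\operatorname{Im}\left(\Gamma^a(t) - \Gamma^b(t)\right)\right]^2 dt \tag{A.2}$$

induced by the inner product:

$$\left\langle \Gamma^a, \Gamma^b \right\rangle = \int_0^1 \operatorname{Re}\Gamma^a(t)\,\operatorname{Re}\Gamma^b(t) + \operatorname{Im}\Gamma^a(t)\,\operatorname{Im}\Gamma^b(t)\, dt \tag{A.3}$$

Using Equation 13, we differentiate $\xi$ with respect to $\alpha$ and set to zero:

$$\frac{\partial}{\partial\alpha}\left\|\Gamma^{obs} - f(\Gamma^{k-1})\right\|^2 = \frac{\partial}{\partial\alpha}\left\|\Gamma^{res} - \alpha g\right\|^2 \tag{A.4}$$

$$= \frac{\partial}{\partial\alpha}\left(\left\|\Gamma^{res}\right\|^2 - 2\alpha\left\langle\Gamma^{res}, g\right\rangle + \alpha^2\|g\|^2\right) \tag{A.5}$$

$$= -2\left(\left\langle\Gamma^{res}, g\right\rangle - \alpha\|g\|^2\right) = 0 \tag{A.6}$$

$$\Rightarrow \alpha = \frac{\left\langle\Gamma^{res}, g\right\rangle}{\|g\|^2}$$

where we used the shorthand $g = g(\Gamma^{k-1} - \zeta; \sigma)$ and $\Gamma^{res} = \Gamma^{obs} - \Gamma^{k-1}$.

As a result, given fixed $\zeta$ and $\sigma$, the optimal unconstrained gain $\alpha^*$ that maximally reduces the $L_2$ error between the observed curve $\Gamma^{obs}$ and current approximation $\Gamma^{k-1}$ is given by

$$\alpha^* = \frac{\left\langle \Gamma^{obs} - \Gamma^{k-1},\; g(\Gamma^{k-1} - \zeta; \sigma) \right\rangle}{\left\| g(\Gamma^{k-1} - \zeta; \sigma) \right\|_2^2}. \tag{A.7}$$

Note that in general Equation A.7 may produce an optimal gain outside the diffeomorphism bounds of Equation 7. However, because the error is a convex quadratic in $\alpha$ (Equation A.5), the optimal gain that satisfies the constraint is simply the unconstrained gain $\alpha^*$ thresholded by the diffeomorphism constraints, as described in Section 4.1.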
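The closed-form gain of Equation A.7, followed by thresholding to the diffeomorphism bounds, can be sketched as below for the Gabor basis. This is a minimal illustration assuming curves sampled as equal-length lists of complex points in 1:1 correspondence. The function names are hypothetical, and the bounds −σ/(2π) and 0.1956σ are the Gabor values quoted in Appendix B.

```python
import math

def gabor_g(points, zeta, sigma):
    """Deformation direction g = (z - zeta) * G(r; sigma) for the Gabor formlet."""
    out = []
    for z in points:
        d = z - zeta
        r = abs(d)
        if r == 0.0:
            out.append(0j)  # the formlet centre is a fixed point
        else:
            out.append(d / r * math.sin(2 * math.pi * r / sigma)
                       * math.exp(-(r / sigma) ** 2))
    return out

def inner(a, b):
    # Discrete version of the inner product in Equation A.3.
    return sum(p.real * q.real + p.imag * q.imag for p, q in zip(a, b)) / len(a)

def optimal_gain(target, current, zeta, sigma):
    """Unconstrained gain (Equation A.7), clipped to the diffeomorphism bounds."""
    g = gabor_g(current, zeta, sigma)
    residual = [t - c for t, c in zip(target, current)]
    denom = inner(g, g)
    alpha = inner(residual, g) / denom if denom > 0 else 0.0
    lower, upper = -sigma / (2 * math.pi), 0.1956 * sigma
    return min(max(alpha, lower), upper)
```

If the target is generated by applying a single formlet with a gain inside the bounds to the current curve, `optimal_gain` should recover that gain. A gain outside the bounds is returned as the nearest bound.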

Appendix B. Jacobian Computation for Nonlinear Least Squares Minimization

The dictionary descent optimization method described in Section 4.2 employs the MATLAB nonlinear least-squares solver lsqnonlin to determine the location parameter ζ and scale parameter σ. lsqnonlin uses the Jacobian of the error function in the unknown parameters to iterate toward the local minimum. The method performs best if an analytic form of the Jacobian can be provided. Note that since the optimal gain α*_c is determined analytically (Equation 15), this value must be used in all computations of the Jacobian in order to determine locally optimal values for the other parameters.


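Combining Equations 11 and 13, and using $r = |\Gamma^{k-1} - \zeta|$, the error function can be written as

$$\begin{aligned} \xi(\Gamma^{obs}, \Gamma^{k}) &= \left\|\Gamma^{obs} - f(\Gamma^{k-1})\right\|^2 \\ &= \left\| \Gamma^{obs} - \Gamma^{k-1} - \alpha_c^*\,\frac{\Gamma^{k-1} - \zeta}{r} \sin\left(\frac{2\pi r}{\sigma}\right) \exp\left(-\frac{r^2}{\sigma^2}\right) \right\|^2 . \end{aligned}$$

Now defining

$$\Gamma^{res} = \Gamma^{obs} - \Gamma^{k-1}$$

and

$$G = G(r;\sigma) = \frac{1}{r}\sin\left(\frac{2\pi r}{\sigma}\right)\exp\left(-\frac{r^2}{\sigma^2}\right),$$

and using $x$ and $y$ subscripts to denote real and imaginary components, we can rewrite this expression as

$$\begin{aligned} \xi(\Gamma^{obs}, \Gamma^{k}) &= \int_0^1 \left[\Gamma^{res}_x(t) - \alpha_c^*\left(\Gamma^{k-1}_x(t) - \zeta_x\right) G\right]^2 dt \\ &\quad + \int_0^1 \left[\Gamma^{res}_y(t) - \alpha_c^*\left(\Gamma^{k-1}_y(t) - \zeta_y\right) G\right]^2 dt \\ &\equiv \int_0^1 \xi_x(t)^2 + \xi_y(t)^2 \, dt. \end{aligned}$$

Since the error is a function of the optimal gain $\alpha_c^*$ and $\alpha_c^*$ is a function of the location parameter $\zeta$ and the scale parameter $\sigma$, we will need the partial derivative of $\alpha_c^*$ with respect to these two parameters. From Equation 15, we have

$$\alpha_c^* = \begin{cases} \alpha_l & \text{for } \alpha^* < \alpha_l \\ \alpha^* & \text{for } \alpha_l \le \alpha^* \le \alpha_u \\ \alpha_u & \text{for } \alpha^* > \alpha_u, \end{cases}$$

where

$$\alpha_l = -(2\pi)^{-1}\sigma$$

and

$$\alpha_u \approx 0.1956\,\sigma,$$

and $\alpha^*$ is given by Equation 14. Thus we have

$$\frac{\partial \alpha_c^*}{\partial \sigma} = \begin{cases} -(2\pi)^{-1} & \text{for } \alpha^* < \alpha_l \\ \dfrac{\partial \alpha^*}{\partial \sigma} & \text{for } \alpha_l \le \alpha^* \le \alpha_u \\ 0.1956 & \text{for } \alpha^* > \alpha_u \end{cases}$$

$$\frac{\partial \alpha_c^*}{\partial \zeta_x} = \begin{cases} 0 & \text{for } \alpha^* < \alpha_l \\ \dfrac{\partial \alpha^*}{\partial \zeta_x} & \text{for } \alpha_l \le \alpha^* \le \alpha_u \\ 0 & \text{for } \alpha^* > \alpha_u \end{cases}$$

$$\frac{\partial \alpha_c^*}{\partial \zeta_y} = \begin{cases} 0 & \text{for } \alpha^* < \alpha_l \\ \dfrac{\partial \alpha^*}{\partial \zeta_y} & \text{for } \alpha_l \le \alpha^* \le \alpha_u \\ 0 & \text{for } \alpha^* > \alpha_u \end{cases}$$

Thus to determine the partial derivatives of the constrained gain $\alpha_c^*$, we must compute the partial derivatives of the unconstrained gain $\alpha^*$, which is defined by Equation 14:

$$\alpha^* = \frac{\left\langle \Gamma^{obs} - \Gamma^{k-1},\; g(\Gamma^{k-1} - \zeta; \sigma) \right\rangle}{\left\| g(\Gamma^{k-1} - \zeta; \sigma) \right\|_2^2},$$

where we have used

$$g(\Gamma^{k-1} - \zeta; \sigma) = (\Gamma^{k-1} - \zeta)\,\frac{1}{r}\sin\left(\frac{2\pi r}{\sigma}\right)\exp\left(-\frac{r^2}{\sigma^2}\right) = (\Gamma^{k-1} - \zeta)\,G(r). \tag{B.1}$$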

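Computing the partial derivatives with respect to the scale parameter $\sigma$ and location parameters $\zeta_x$ and $\zeta_y$, we obtain:

$$\frac{\partial \alpha^*}{\partial \sigma} = \frac{\frac{\partial}{\partial \sigma}\langle\Gamma^{res}, g\rangle\,\|g\|^2 - \langle\Gamma^{res}, g\rangle\,\frac{\partial}{\partial \sigma}\|g\|^2}{\|g\|^4}, \quad \text{where:}$$

$$\begin{aligned} \frac{\partial}{\partial \sigma}\langle\Gamma^{res}, g\rangle &= \frac{\partial}{\partial \sigma}\int_0^1 \Gamma^{res}_x(t)\left(\Gamma^{k-1}_x(t) - \zeta_x\right)G + \Gamma^{res}_y(t)\left(\Gamma^{k-1}_y(t) - \zeta_y\right)G \, dt \\ &= \int_0^1 \left\langle \Gamma^{res}(t),\, \Gamma^{k-1}(t) - \zeta \right\rangle \frac{\partial G}{\partial \sigma}\, dt \end{aligned}$$

$$\begin{aligned} \frac{\partial}{\partial \sigma}\|g\|^2 &= \frac{\partial}{\partial \sigma}\int_0^1 \left[\left(\Gamma^{k-1}_x(t) - \zeta_x\right)G\right]^2 + \left[\left(\Gamma^{k-1}_y(t) - \zeta_y\right)G\right]^2 dt \\ &= \int_0^1 2G\,\frac{\partial G}{\partial \sigma}\left\|\Gamma^{k-1}(t) - \zeta\right\|^2 dt \end{aligned}$$

$$\frac{\partial \alpha^*}{\partial \zeta_x} = \frac{\frac{\partial}{\partial \zeta_x}\langle\Gamma^{res}, g\rangle\,\|g\|^2 - \langle\Gamma^{res}, g\rangle\,\frac{\partial}{\partial \zeta_x}\|g\|^2}{\|g\|^4}, \quad \text{where:}$$

$$\begin{aligned} \frac{\partial}{\partial \zeta_x}\langle\Gamma^{res}, g\rangle &= \frac{\partial}{\partial \zeta_x}\int_0^1 \Gamma^{res}_x(t)\left(\Gamma^{k-1}_x(t) - \zeta_x\right)G + \Gamma^{res}_y(t)\left(\Gamma^{k-1}_y(t) - \zeta_y\right)G \, dt \\ &= \int_0^1 \Gamma^{res}_x(t)\left[-G + \left(\Gamma^{k-1}_x(t) - \zeta_x\right)\frac{\partial G}{\partial \zeta_x}\right] + \Gamma^{res}_y(t)\left(\Gamma^{k-1}_y(t) - \zeta_y\right)\frac{\partial G}{\partial \zeta_x}\, dt \end{aligned}$$

$$\begin{aligned} \frac{\partial}{\partial \zeta_x}\|g\|^2 &= \frac{\partial}{\partial \zeta_x}\int_0^1 \left[\left(\Gamma^{k-1}_x(t) - \zeta_x\right)G\right]^2 + \left[\left(\Gamma^{k-1}_y(t) - \zeta_y\right)G\right]^2 dt \\ &= \int_0^1 2\left(\Gamma^{k-1}_x(t) - \zeta_x\right)G\left[-G + \left(\Gamma^{k-1}_x(t) - \zeta_x\right)\frac{\partial G}{\partial \zeta_x}\right] + 2G\,\frac{\partial G}{\partial \zeta_x}\left(\Gamma^{k-1}_y(t) - \zeta_y\right)^2 dt \end{aligned}$$

$$\frac{\partial \alpha^*}{\partial \zeta_y} = \frac{\frac{\partial}{\partial \zeta_y}\langle\Gamma^{res}, g\rangle\,\|g\|^2 - \langle\Gamma^{res}, g\rangle\,\frac{\partial}{\partial \zeta_y}\|g\|^2}{\|g\|^4}, \quad \text{where:}$$

$$\begin{aligned} \frac{\partial}{\partial \zeta_y}\langle\Gamma^{res}, g\rangle &= \frac{\partial}{\partial \zeta_y}\int_0^1 \Gamma^{res}_x(t)\left(\Gamma^{k-1}_x(t) - \zeta_x\right)G + \Gamma^{res}_y(t)\left(\Gamma^{k-1}_y(t) - \zeta_y\right)G \, dt \\ &= \int_0^1 \Gamma^{res}_x(t)\left(\Gamma^{k-1}_x(t) - \zeta_x\right)\frac{\partial G}{\partial \zeta_y} + \Gamma^{res}_y(t)\left[-G + \left(\Gamma^{k-1}_y(t) - \zeta_y\right)\frac{\partial G}{\partial \zeta_y}\right] dt \end{aligned}$$

$$\begin{aligned} \frac{\partial}{\partial \zeta_y}\|g\|^2 &= \frac{\partial}{\partial \zeta_y}\int_0^1 \left[\left(\Gamma^{k-1}_x(t) - \zeta_x\right)G\right]^2 + \left[\left(\Gamma^{k-1}_y(t) - \zeta_y\right)G\right]^2 dt \\ &= \int_0^1 2G\,\frac{\partial G}{\partial \zeta_y}\left(\Gamma^{k-1}_x(t) - \zeta_x\right)^2 + 2\left(\Gamma^{k-1}_y(t) - \zeta_y\right)G\left[-G + \left(\Gamma^{k-1}_y(t) - \zeta_y\right)\frac{\partial G}{\partial \zeta_y}\right] dt \end{aligned}$$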

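We are now ready to compute the Jacobian matrix. From Equation B.1 we have that:

$$\xi_x(t_i) = \Gamma^{res}_x(t_i) - \alpha\left(\Gamma^{k-1}_x(t_i) - \zeta_x\right)G$$

$$\xi_y(t_i) = \Gamma^{res}_y(t_i) - \alpha\left(\Gamma^{k-1}_y(t_i) - \zeta_y\right)G$$

Thus,

$$\frac{\partial \xi_x(t_i)}{\partial \sigma} = -\left(\Gamma^{k-1}_x(t_i) - \zeta_x\right)\left[\frac{\partial \alpha}{\partial \sigma}G + \alpha\frac{\partial G}{\partial \sigma}\right] \tag{B.2}$$

$$\frac{\partial \xi_y(t_i)}{\partial \sigma} = -\left(\Gamma^{k-1}_y(t_i) - \zeta_y\right)\left[\frac{\partial \alpha}{\partial \sigma}G + \alpha\frac{\partial G}{\partial \sigma}\right] \tag{B.3}$$

$$\frac{\partial \xi_x(t_i)}{\partial \zeta_x} = -\frac{\partial \alpha}{\partial \zeta_x}\left(\Gamma^{k-1}_x(t_i) - \zeta_x\right)G + \alpha G - \alpha\left(\Gamma^{k-1}_x(t_i) - \zeta_x\right)\frac{\partial G}{\partial \zeta_x} \tag{B.4}$$

$$\frac{\partial \xi_y(t_i)}{\partial \zeta_x} = -\left(\Gamma^{k-1}_y(t_i) - \zeta_y\right)\left[\frac{\partial \alpha}{\partial \zeta_x}G + \alpha\frac{\partial G}{\partial \zeta_x}\right] \tag{B.5}$$

$$\frac{\partial \xi_x(t_i)}{\partial \zeta_y} = -\left(\Gamma^{k-1}_x(t_i) - \zeta_x\right)\left[\frac{\partial \alpha}{\partial \zeta_y}G + \alpha\frac{\partial G}{\partial \zeta_y}\right] \tag{B.6}$$

$$\frac{\partial \xi_y(t_i)}{\partial \zeta_y} = -\frac{\partial \alpha}{\partial \zeta_y}\left(\Gamma^{k-1}_y(t_i) - \zeta_y\right)G + \alpha G - \alpha\left(\Gamma^{k-1}_y(t_i) - \zeta_y\right)\frac{\partial G}{\partial \zeta_y} \tag{B.7}$$

where for the Gabor basis, we have:

$$\frac{\partial G}{\partial \sigma} = \exp\left(-\frac{r^2}{\sigma^2}\right)\left[-\frac{2\pi}{\sigma^2}\cos\left(\frac{2\pi r}{\sigma}\right) + \frac{2r}{\sigma^3}\sin\left(\frac{2\pi r}{\sigma}\right)\right]$$

$$\frac{\partial G}{\partial \zeta_x} = \exp\left(-\frac{r^2}{\sigma^2}\right)\frac{\Gamma^{k-1}_x(t_i) - \zeta_x}{r}\left[-\frac{2\pi}{\sigma r}\cos\left(\frac{2\pi r}{\sigma}\right) + \frac{2}{\sigma^2}\sin\left(\frac{2\pi r}{\sigma}\right) + \frac{1}{r^2}\sin\left(\frac{2\pi r}{\sigma}\right)\right]$$

$$\frac{\partial G}{\partial \zeta_y} = \exp\left(-\frac{r^2}{\sigma^2}\right)\frac{\Gamma^{k-1}_y(t_i) - \zeta_y}{r}\left[-\frac{2\pi}{\sigma r}\cos\left(\frac{2\pi r}{\sigma}\right) + \frac{2}{\sigma^2}\sin\left(\frac{2\pi r}{\sigma}\right) + \frac{1}{r^2}\sin\left(\frac{2\pi r}{\sigma}\right)\right]$$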
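Analytic derivatives of this kind are easy to get wrong in an implementation, so a finite-difference check can be a useful safeguard before passing a Jacobian to a solver. The sketch below compares the Gabor expressions for ∂G/∂σ and ∂G/∂ζ_x against central differences. The function names, the sample point and the step size are illustrative assumptions, not part of the authors' code.

```python
import math

def gabor_G(r, sigma):
    """G(r; sigma) = sin(2*pi*r/sigma) * exp(-r^2/sigma^2) / r for the Gabor basis."""
    return math.sin(2 * math.pi * r / sigma) * math.exp(-(r / sigma) ** 2) / r

def dG_dsigma(r, sigma):
    a = 2 * math.pi * r / sigma
    e = math.exp(-(r / sigma) ** 2)
    return e * (-(2 * math.pi / sigma ** 2) * math.cos(a)
                + (2 * r / sigma ** 3) * math.sin(a))

def dG_dzeta_x(px, py, zx, zy, sigma):
    # (px, py) is a curve point; (zx, zy) is the formlet location zeta.
    dx, dy = px - zx, py - zy
    r = math.hypot(dx, dy)
    a = 2 * math.pi * r / sigma
    e = math.exp(-(r / sigma) ** 2)
    return e * dx / r * (-(2 * math.pi / (sigma * r)) * math.cos(a)
                         + (2 / sigma ** 2) * math.sin(a)
                         + math.sin(a) / r ** 2)

def central_difference(f, x, h=1e-6):
    """Second-order finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)
```

The derivative with respect to ζ_y follows by swapping the roles of the x and y components, so the same check applies to it.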

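It is straightforward to show that Equations B.2 - B.7 also apply to the Gaussian and Spline bases (Section 7.2), with suitable definitions of $G(r;\sigma)$:

Gaussian Basis:

$$G(r;\sigma) = \frac{2\pi}{\sigma}\exp\left(-\frac{r^2}{\sigma^2}\right)$$

$$\frac{\partial G}{\partial \sigma} = 2\pi\exp\left(-\frac{r^2}{\sigma^2}\right)\left[-\frac{1}{\sigma^2} + \frac{2r^2}{\sigma^4}\right]$$

$$\frac{\partial G}{\partial \zeta_x} = \frac{4\pi}{\sigma^3}\exp\left(-\frac{r^2}{\sigma^2}\right)\left(\Gamma^{k-1}_x - \zeta_x\right)$$

$$\frac{\partial G}{\partial \zeta_y} = \frac{4\pi}{\sigma^3}\exp\left(-\frac{r^2}{\sigma^2}\right)\left(\Gamma^{k-1}_y - \zeta_y\right)$$

Spline Basis:

$$G(r;\sigma) = \frac{(r - \sigma)^2}{\sigma^2}$$

$$\frac{\partial G}{\partial \sigma} = -\frac{2}{\sigma^3}(r - \sigma)^2 - \frac{2}{\sigma^2}(r - \sigma)$$

$$\frac{\partial G}{\partial \zeta_x} = -\frac{2}{r\sigma^2}(r - \sigma)\left(\Gamma^{k-1}_x - \zeta_x\right)$$

$$\frac{\partial G}{\partial \zeta_y} = -\frac{2}{r\sigma^2}(r - \sigma)\left(\Gamma^{k-1}_y - \zeta_y\right)$$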