Spherical Demons: Fast Diffeomorphic Landmark-Free Surface Registration

B.T. Thomas Yeo*,

Computer Science and Artificial Intelligence Laboratory, Department of Electrical Engineering and

Computer Science, Massachusetts Institute of Technology, Cambridge, USA

Mert R. Sabuncu*,

Computer Science and Artificial Intelligence Laboratory, Department of Electrical Engineering and

Computer Science, Massachusetts Institute of Technology, Cambridge, USA

Tom Vercauteren,

Mauna Kea Technologies, Paris, France

Nicholas Ayache,

Asclepios Group, INRIA, Sophia Antipolis, France

Bruce Fischl, and

Computer Science and Artificial Intelligence Laboratory, Department of Electrical Engineering and

Computer Science, Massachusetts Institute of Technology, Cambridge, USA; Department of

Radiology, Harvard Medical School, Charlestown, USA and the Division of Health Sciences and

Technology, Massachusetts Institute of Technology, Cambridge, USA

Polina Golland

Computer Science and Artificial Intelligence Laboratory, Department of Electrical Engineering and

Computer Science, Massachusetts Institute of Technology, Cambridge, USA

B.T. Thomas Yeo: ythomas@csail.mit.edu; Mert R. Sabuncu: msabuncu@csail.mit.edu; Tom Vercauteren:

tom.vercauteren@maunakeatech.com; Nicholas Ayache: nicholas.ayache@sophia.inria.fr; Bruce Fischl:

fischl@nmr.mgh.harvard.edu; Polina Golland: polina@csail.mit.edu

Abstract

We present the Spherical Demons algorithm for registering two spherical images. By exploiting

spherical vector spline interpolation theory, we show that a large class of regularizers for the

modified Demons objective function can be efficiently approximated on the sphere using iterative

smoothing. Based on one-parameter subgroups of diffeomorphisms, the resulting registration is

diffeomorphic and fast. The Spherical Demons algorithm can also be modified to register a given

spherical image to a probabilistic atlas. We demonstrate two variants of the algorithm

corresponding to warping the atlas or warping the subject. Registration of a cortical surface mesh

to an atlas mesh, both with more than 160k nodes, requires less than 5 minutes when warping the

atlas and less than 3 minutes when warping the subject on a Xeon 3.2GHz single processor

machine. This is comparable to the fastest non-diffeomorphic landmark-free surface registration

algorithms. Furthermore, the accuracy of our method compares favorably to the popular

FreeSurfer registration algorithm. We validate the technique in two different applications that use

registration to transfer segmentation labels onto a new image: (1) parcellation of in-vivo cortical

surfaces and (2) Brodmann area localization in ex-vivo cortical surfaces.

*B.T. Thomas Yeo and Mert R. Sabuncu contributed equally to this paper.

NIH Public Access

Author Manuscript

IEEE Trans Med Imaging. Author manuscript; available in PMC 2011 March 01.

Published in final edited form as:

IEEE Trans Med Imaging. 2010 March ; 29(3): 650–668. doi:10.1109/TMI.2009.2030797.

NIH-PA Author Manuscript

Index Terms

Surface Registration; Spherical Registration; Cortical Registration; Vector Field Interpolation;

Demons; Diffeomorphism

I. Introduction

Motivated by many successful applications of the spherical representation of the cerebral

cortex, this paper addresses the problem of registering two spherical images. Cortical

folding patterns have been shown to correlate with both cytoarchitectural [25], [68] and

functional regions [64], [27]. In group studies of cortical structure and function, determining

corresponding folds across subjects is therefore important. There has been much effort

focused on registering cortical surfaces in 3D [22], [23], [30], [58]. Since cortical areas –

both structure and function – are arranged in a mosaic across the cortical surface, an

alternative approach is to model the surface as a 2D closed manifold in 3D and to warp the

underlying spherical coordinate system [27], [50], [59], [60], [64], [67]. Warping the

spherical coordinate system establishes correspondences across the surfaces without actually

deforming the surfaces in 3D.

Deformation Model

There is frequently a need for invertible deformations that preserve the topology of

structural or functional regions across subjects. Unfortunately, this causes many spherical

warping algorithms to be computationally expensive. Previously demonstrated methods for

cortical registration [27], [60], [67] rely on soft regularization constraints to encourage

invertibility. These require unfolding the mesh triangles, or limit the size of optimization

steps to achieve invertibility [27], [67]. Elegant regularization penalties that guarantee

invertibility exist [5], [46] but they explicitly rely on special properties of the Euclidean

image space that do not hold for the sphere.

An alternative approach to achieving invertibility is to work in the group of

diffeomorphisms [4], [7], [9], [22], [31], [43], [66]. In this case, the underlying theory of

flows of vector fields can be extended to manifolds [11], [44], [47]. The Large Deformation

Diffeomorphic Metric Mapping (LDDMM) [7], [9], [22], [31], [43] is a popular framework

that seeks a time-varying velocity field representation of a diffeomorphism. Because

LDDMM optimizes over the entire path of the diffeomorphism, the resulting method is slow

and memory intensive. By contrast, Ashburner [4] and Hernandez et al. [33] consider

diffeomorphic transformations parameterized by a single stationary velocity field. While

restricting the space of solutions reduces the memory needs relative to LDDMM, these

algorithms still have to consider the entire trajectory of the deformation induced by the

velocity field when computing the gradients of the objective function, leading to a long run

time. We note that recent algorithmic advances [34], [43] promise to improve the speed and

relieve the memory requirements of both LDDMM and the stationary velocity approach.

In this work, we adopt the approach of the Diffeomorphic Demons algorithm [66],

demonstrated in the Euclidean image space, which constructs the deformation space that

contains compositions of diffeomorphisms, each of which is parameterized by a stationary

velocity field. Unlike the Euclidean Diffeomorphic Demons, the Spherical Demons

algorithm utilizes velocity vectors tangent to the sphere and not arbitrary 3D vectors. This

constraint need not be taken care of explicitly in our algorithm since we directly work with

the tangent spaces. In each iteration, the method greedily seeks the locally optimal

diffeomorphism to be composed with the current transformation. As a result, the approach is

much faster than LDDMM [7], [9], [22], [31] and its simplifications [4], [33]. A drawback is

that the path of deformation is no longer a geodesic in the group of diffeomorphisms.

Image Similarity vs. Regularization Tradeoffs

Another challenge in registration is the tradeoff between the image similarity measure and

the regularization in the objective function. Since most types of regularization favor smooth

deformations, the gradient computation is complicated by the need to take into account the

deformation in neighboring regions. For Euclidean images, the popular Demons algorithm

[57] can be interpreted as optimizing an objective function with two regularization terms

[14], [66]. The special form of the objective function facilitates a fast two-step optimization

where the second step handles the warp regularization via a single convolution with a

smoothing filter.

Using spherical vector spline interpolation theory [31] and other differential geometric tools,

we show that the two-stage optimization procedure of Demons can be efficiently

approximated on the sphere. We note that the problem is not trivial since tangent vectors at

different points on the sphere are not directly comparable. We also emphasize that this

decoupling of the image similarity and the warp regularization could also be accomplished

with a different space of admissible warps, e.g., spherical thin plate splines [72].

Interpolation

Another reason spherical image registration is slow is the difficulty of

performing interpolation on a spherical grid, unlike a regular Euclidean grid. In this paper,

we use existing methods for interpolation, requiring about one second to interpolate data

from a spherical mesh of 160k vertices onto another spherical mesh of 160k vertices. Recent

work on using different coordinate charts of the sphere [63] promises to further speed up our

implementation of the Spherical Demons algorithm.

While most discussion in this paper concentrates on pairwise registration of spherical

images, the proposed Spherical Demons algorithm can be modified to incorporate a

probabilistic atlas. We derive and implement two variants of the algorithm for registration to

an atlas corresponding to whether we warp the atlas or the subject. On a Xeon 3.2GHz

single processor machine, registration of a cortical surface mesh to an atlas mesh, both with

more than 160k nodes, requires less than 5 minutes when warping the atlas and less than 3

minutes when warping the subject. Note that the registration runtime reported includes

registration components dealing with rotation, which takes up one quarter of the total

runtime. The total runtime is comparable to other nonlinear landmark-free cortical surface

registration algorithms whose runtime ranges from minutes [23], [60] to more than an hour

[27], [67]. However, the other fast algorithms suffer from folding spherical triangles [60]

and intersecting triangles in 3D [23] since only soft constraints are used. No runtime

comparison can be made with spherical registration algorithms of the LDDMM type because

to the best of our knowledge, no landmark-free LDDMM spherical registration algorithm

that handles cortical surfaces has been developed yet.

Unlike landmark-based methods for surface registration [8], [22], [31], [50], [58], [64], we

do not assume the existence of corresponding landmarks. Landmark-free methods have the

advantage of allowing for a fully automatic processing and analysis of medical images.

Unfortunately, landmark-free registration is also more challenging, because no information

about correspondences is provided. The difficulty is exacerbated for the cerebral cortex

since different sulci and gyri appear locally similar. Nevertheless, we demonstrate that our

algorithm is accurate in both cortical parcellation and cytoarchitectonic localization

applications.

The Spherical Demons algorithm for registering cortical surfaces presented here does not

take into account the metric properties of the original cortical surface. FreeSurfer [27] uses a

regularization that penalizes deformation of the spherical coordinate system based on the

distortion computed on the original cortical surface. Thompson et al. [59] suggest the use of

Christoffel symbols [39] to correct for the metric distortion of the initial spherical coordinate

system during the registration process. However, it is unclear whether correcting for the

metric properties of the cortex is important in practice, since we demonstrate that the

accuracy of the Spherical Demons algorithm compares favorably to that of FreeSurfer. A

possible reason is that we initialize the registration with a spherical parametrization that

minimizes metric distortion between the spherical representation and the original cortical

surface [27].

This paper is organized as follows. In the next section, we discuss the classical Demons

algorithm [57] and its diffeomorphic variant [66]. In Section III, we present the extension of

the Diffeomorphic Demons algorithm to the sphere. We conclude with experiments in

Section IV and further discussion in Section V. The appendices provide technical and

implementation details of the Spherical Demons algorithm and the extension to probabilistic

atlases. This paper extends a previously presented conference article [69] and contains

detailed derivations and discussions that were left out in the conference version. We note

that our Spherical Demons code is freely available¹. To summarize,

1. We demonstrate that the Demons algorithm can be efficiently extended to the sphere.

2. We demonstrate that the use of a limited class of diffeomorphisms combined with the Demons algorithm yields a speed gain of more than an order of magnitude compared with other landmark-free invertible spherical registration methods, such as [27], [67].

3. We validate our algorithm by demonstrating an accuracy comparable to that of the popular FreeSurfer algorithm [27] in two different applications.

II. Background - Demons Algorithm

We choose to work with the modified Demons objective function [14], [66]. Let F be the

fixed image, M be the moving image and Γ be the desired transformation that deforms the

moving image M to match the fixed image F. Throughout this paper, we assume that F and

M are scalar images, even though it is easy to extend the results to vector images [70]. We

introduce a hidden transformation ϒ and seek

$$(\Upsilon^{*}, \Gamma^{*}) = \operatorname*{argmin}_{\Upsilon,\,\Gamma}\ \left\| \Sigma^{-1}\left(F - M \circ \Gamma\right) \right\|^{2} + \frac{1}{\sigma_x^{2}}\,\mathrm{dist}(\Upsilon, \Gamma) + \frac{1}{\sigma_T^{2}}\,\mathrm{Reg}(\Upsilon). \tag{3}$$

In this case, the fixed image F and warped moving image M ∘ Γ are treated as N × 1 vectors. Typically, dist(ϒ, Γ) = ‖ϒ − Γ‖², encouraging the resulting transformation Γ to be close to the hidden transformation ϒ, and Reg(ϒ) = ‖∇(ϒ − Id)‖², i.e., the regularization penalizes the gradient magnitude of the displacement field ϒ − Id of the hidden transformation ϒ. σx and σT provide a tradeoff among the different terms of the objective function. Σ is typically a diagonal matrix that models the variability of a feature at a particular voxel. It can be set manually or estimated during the construction of an atlas.

This formulation facilitates a two-step optimization procedure that alternately optimizes the first two (first and second) and last two (second and third) terms of Eq. (3). Starting from an initial displacement field ϒ(0), the Demons algorithm iteratively seeks an update transformation to be composed with the current estimate, as summarized in Algorithm 1.

Algorithm 1
Demons Algorithm

Data: A fixed image F and moving image M.
Result: Transformation Γ so that M ∘ Γ is “close” to F.
Set ϒ(0) = identity transformation (or some a-priori transformation, e.g., from a previous registration)
repeat
    Step 1. Given ϒ(t), minimize the first two terms of Eq. (3):

    $$u^{(t)} = \operatorname*{argmin}_{u}\ \left\| \Sigma^{-1}\left(F - M \circ \{\Upsilon^{(t)} \circ u\}\right) \right\|^{2} + \frac{1}{\sigma_x^{2}}\,\mathrm{dist}\left(\Upsilon^{(t)}, \{\Upsilon^{(t)} \circ u\}\right), \tag{1}$$

    where u is any admissible transformation. Compute Γ(t) = ϒ(t) ∘ u(t).
    Step 2. Given Γ(t), minimize the last two terms of Eq. (3):

    $$\Upsilon^{(t+1)} = \operatorname*{argmin}_{\Upsilon}\ \frac{1}{\sigma_x^{2}}\,\mathrm{dist}\left(\Upsilon, \Gamma^{(t)}\right) + \frac{1}{\sigma_T^{2}}\,\mathrm{Reg}(\Upsilon). \tag{2}$$

until convergence;

¹ There are two versions of the code (MATLAB and ITK) available at http://sites.google.com/site/yeoyeo02/software/sphericaldemonsrelease. The MATLAB code is used in the experiments of this paper. The preliminary ITK code [35], [36], [37] can also be found at http://hdl.handle.net/10380/3117.

In the original Demons algorithm [57], the space of admissible warps includes all 3D displacement fields, and the composition operator ∘ corresponds to the addition of displacement fields. The resulting transformation might therefore not be invertible. In the Diffeomorphic Demons algorithm [66], the update u is a diffeomorphism from ℝ³ to ℝ³ parameterized by a stationary velocity field υ⃗. Note that υ⃗ is a function that associates a tangent vector with each point in ℝ³. Under certain mild smoothness conditions, a stationary velocity field υ⃗ is related to a diffeomorphism through the exponential mapping u = exp(υ⃗). In this case, the stationary ODE ∂x(t)/∂t = υ⃗(x(t)) with the initial condition x(0) ∈ ℝ³ yields exp(υ⃗) as a solution at time 1, i.e., x(1) = exp(υ⃗)(x(0)) ∈ ℝ³. In other words, exp(υ⃗) maps point x(0) to point x(1).
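The relation between a stationary velocity field and its time-1 flow can be illustrated with a minimal sketch. It assumes a linear velocity field υ⃗(x) = Ax in the plane (a choice made here purely for illustration, so that maps compose by matrix multiplication), and the function names are hypothetical. The sketch compares forward Euler integration of the ODE with a "scaling and squaring" approximation of exp(υ⃗).

```python
import numpy as np

def flow_by_euler(A, x0, steps=2000):
    """Integrate dx/dt = A x from t = 0 to t = 1 with forward Euler."""
    x = np.asarray(x0, dtype=float)
    dt = 1.0 / steps
    for _ in range(steps):
        x = x + dt * (A @ x)
    return x

def exp_by_scaling_and_squaring(A, K=12):
    """Approximate the time-1 flow map of v(x) = A x.

    Scale: the flow over time 1/2**K is close to the small step I + A / 2**K.
    Square: composing a map with itself doubles the integration time.
    """
    phi = np.eye(A.shape[0]) + A / 2.0**K
    for _ in range(K):
        phi = phi @ phi  # composition of linear maps
    return phi

theta = 0.5
A = np.array([[0.0, -theta], [theta, 0.0]])  # velocity field of a planar rotation
x0 = np.array([1.0, 0.0])
x1_exact = np.array([np.cos(theta), np.sin(theta)])  # closed-form x(1)
x1_ss = exp_by_scaling_and_squaring(A) @ x0
x1_euler = flow_by_euler(A, x0)
```

For this rotation field both approximations of x(1) should agree with the closed-form point to within the discretization error. For a general nonlinear velocity field, the squaring step would compose sampled deformation fields by interpolation rather than by matrix products.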

The Demons algorithm and its variants are fast because for certain forms of dist(ϒ, Γ) and

Reg(ϒ), Step 1 reduces to a non-linear least-squares problem that can be efficiently

minimized via Gauss-Newton optimization and Step 2 can be solved by a single convolution

of the displacement field Γ with a smoothing kernel. The proof of the reduction of Step 2 to

a smoothing operation is illuminating and holds for dist(ϒ, Γ) = ‖ϒ − Γ‖² and any Sobolev

norm Reg(ϒ) = Σi σi‖∇ⁱ(ϒ − Id)‖² [14], [45]. In practice, a Gaussian filter is used without

consideration of the actual induced norm [14], [66]. The proof uses Fourier transforms and

is therefore specific to the Euclidean domain. Due to differences between the geometry of

the sphere and Euclidean spaces, we will see in Section III-D that the reduction of Step 2 to

a smoothing operation is only an approximation on the sphere.
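The Euclidean reduction of Step 2 to a filtering operation can be checked numerically in a simplified setting. The sketch below assumes a periodic 1D grid, dist(ϒ, Γ) = ‖ϒ − Γ‖² and a first-order regularizer ‖Dϒ‖² with D a forward difference; the function names and the value of `lam` (standing in for σx²/σT²) are illustrative assumptions. It solves the quadratic problem once by dividing Fourier coefficients and once through the normal equations.

```python
import numpy as np

def step2_by_filtering(gamma, lam):
    """Minimize ||y - gamma||^2 + lam * ||D y||^2 on a periodic 1D grid.

    Each Fourier coefficient is divided by 1 + lam * s_k, where
    s_k = 2 - 2 cos(2 pi k / n) is the spectrum of D^T D.
    """
    n = gamma.size
    spectrum = 2.0 - 2.0 * np.cos(2.0 * np.pi * np.arange(n) / n)
    return np.real(np.fft.ifft(np.fft.fft(gamma) / (1.0 + lam * spectrum)))

def step2_by_linear_solve(gamma, lam):
    """Same minimizer from the normal equations (I + lam * D^T D) y = gamma."""
    n = gamma.size
    D = np.roll(np.eye(n), 1, axis=1) - np.eye(n)  # periodic forward difference
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, gamma)

rng = np.random.default_rng(0)
gamma = rng.standard_normal(64)  # noisy 1D stand-in for a displacement field
lam = 4.0                        # plays the role of sigma_x^2 / sigma_T^2
smooth_fft = step2_by_filtering(gamma, lam)
smooth_direct = step2_by_linear_solve(gamma, lam)
```

The two solutions coincide because the Fourier basis diagonalizes the circulant system, so the minimizer is a low-pass filtered copy of Γ. This argument relies on the translation invariance of the Euclidean grid, which is what is lost on the sphere.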

III. Spherical Demons

In this section, we demonstrate suitable choices of dist(ϒ, Γ) and Reg(ϒ) that lead to efficient optimization of the modified Demons objective function in Eq. (3) on the unit sphere S². We construct updates u as diffeomorphisms from S² to S² parameterized by a stationary velocity field υ⃗. We emphasize that unlike Diffeomorphic Demons [66], υ⃗ is a tangent vector field on the sphere and not an arbitrary 3D vector field. A glossary of common terms used throughout the paper is found in Table I.

A. Choice of dist(ϒ, Γ)

Suppose the transformations Γ and ϒ map a point xn ∈ S² to two different points Γ(xn) ∈ S² and ϒ(xn) ∈ S² respectively. An intuitive notion of distance between Γ(xn) and ϒ(xn) would be the geodesic distance between them. Therefore, we could define dist(ϒ, Γ) = Σn geodesic(ϒ(xn), Γ(xn)). For reasons that will become clear in Section III-D, we prefer to define dist(ϒ, Γ) in terms of a tangent vector representation of the transformations Γ and ϒ, illustrated in Fig. 1, where the length of the tangent vector encodes the amount of deformation.

Let TxnS² be the tangent space at xn. We define Γ⃗n ∈ TxnS² to be the tangent vector at xn pointing along the great circle connecting xn to Γ(xn). In this work, we set the length of Γ⃗n to be equal to the sine of the angle between xn and Γ(xn). With this particular choice of length, there is a one-to-one correspondence between Γ(xn) and Γ⃗n, assuming the angle between xn and Γ(xn) is less than π/2, which is a reasonable assumption even for relatively large deformations. The choice of this length leads to a compact representation of Γ⃗n via vector products. We define Gn to be the 3 × 3 skew-symmetric matrix representing the cross-product of xn with any vector:

$$G_n = \begin{pmatrix} 0 & -x_n(3) & x_n(2) \\ x_n(3) & 0 & -x_n(1) \\ -x_n(2) & x_n(1) & 0 \end{pmatrix}, \tag{4}$$

where xn(i) is the i-th coordinate of xn. Thus, xn × Γ(xn) = GnΓ(xn). Then on a unit sphere, we obtain

$$\vec{\Gamma}_n = -x_n \times \left(x_n \times \Gamma(x_n)\right) = -G_n^{2}\,\Gamma(x_n). \tag{5}$$

A more intuitive choice for the length of Γ⃗n might be the geodesic distance between xn and Γ(xn). If we restrict Γ⃗n to be at most length π, there is a one-to-one mapping between this choice of the tangent vector Γ⃗n and the resulting transformation Γ(xn). Indeed, such a choice of a tangent vector corresponds to an exponential map of S² [39]. The resulting expression for the tangent vector is feasible, but more complicated than Eq. (5). In this paper, for simplicity, we follow the definition in Eq. (5).
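The tangent vector representation can be sketched numerically as follows. The snippet assumes the unit sphere and the representation Γ⃗n = −Gn²Γ(xn), i.e., the projection of Γ(xn) onto the tangent plane at xn. The helper names `skew`, `to_tangent` and `from_tangent` are hypothetical and introduced only for illustration.

```python
import numpy as np

def skew(x):
    """3 x 3 skew-symmetric matrix G with G @ v == np.cross(x, v)."""
    return np.array([[0.0, -x[2], x[1]],
                     [x[2], 0.0, -x[0]],
                     [-x[1], x[0], 0.0]])

def to_tangent(x, y):
    """Tangent vector at the unit point x that encodes the unit point y."""
    G = skew(x)
    return -G @ G @ y

def from_tangent(x, t):
    """Inverse of to_tangent, valid when the angle between x and y is below pi/2."""
    return t + np.sqrt(1.0 - t @ t) * x

angle = 0.3
x = np.array([0.0, 0.0, 1.0])
y = np.array([np.sin(angle), 0.0, np.cos(angle)])
t = to_tangent(x, y)
```

Under these assumptions `t` is orthogonal to `x`, its length equals the sine of the angle between `x` and `y`, and `from_tangent` recovers `y`, which illustrates the one-to-one correspondence for angles below π/2.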

Given N vertices {xn}, n = 1,…, N, the set of transformed points {Γ(xn)}, or equivalently the tangent vectors {Γ⃗n}, together with a choice of an interpolation function define the transformation Γ everywhere on S². Similarly, we can define the transformation ϒ or the equivalent tangent vector field ϒ⃗ through a set of N tangent vectors {ϒ⃗n}. We emphasize that these tangent vector fields are simply a convenient representation of the transformations ϒ and Γ and should not be confused with the stationary velocity field υ⃗ that will be used later on. We now set

$$\mathrm{dist}(\Upsilon, \Gamma) = \sum_{n=1}^{N} \left\| \vec{\Upsilon}_n - \vec{\Gamma}_n \right\|^{2}, \tag{6}$$

which is well-defined since both Γ⃗n and ϒ⃗n belong to TxnS² for each n = 1,…, N.

B. Spherical Demons Step 1

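In this section, we show that the update for Step 1 of the Spherical Demons algorithm can be computed independently for each vertex. With our choice of dist(ϒ, Γ), Step 1 of the algorithm becomes a minimization with respect to the velocity field υ⃗ ≜ {υ⃗n}, n = 1,…, N. By substituting u = exp(υ⃗) and the definition of dist(ϒ, Γ) in Eq. (6) into Eq. (1), we obtain

\begin{align}
\vec{\upsilon}^{(t)} &= \operatorname*{argmin}_{\vec{\upsilon}}\ \left\| \Sigma^{-1}\left(F - M \circ \{\Upsilon^{(t)} \circ \exp(\vec{\upsilon})\}\right) \right\|^{2} + \frac{1}{\sigma_x^{2}}\,\mathrm{dist}\left(\Upsilon^{(t)}, \{\Upsilon^{(t)} \circ \exp(\vec{\upsilon})\}\right) \tag{7}\\
&= \operatorname*{argmin}_{\vec{\upsilon}}\ \sum_{n=1}^{N} \frac{1}{\sigma_n^{2}} \left(F(x_n) - M \circ \{\Upsilon^{(t)} \circ \exp(\vec{\upsilon})\}(x_n)\right)^{2} + \frac{1}{\sigma_x^{2}} \sum_{n=1}^{N} \left\| \vec{\Upsilon}_n^{(t)} - \overrightarrow{\{\Upsilon^{(t)} \circ \exp(\vec{\upsilon})\}}_n \right\|^{2} \tag{8}\\
&= \operatorname*{argmin}_{\vec{\upsilon}}\ \sum_{n=1}^{N} \frac{1}{\sigma_n^{2}} \left(F(x_n) - M \circ \{\Upsilon^{(t)} \circ \exp(\vec{\upsilon})\}(x_n)\right)^{2} + \frac{1}{\sigma_x^{2}} \sum_{n=1}^{N} \left\| \vec{\Upsilon}_n^{(t)} + G_n^{2}\,\{\Upsilon^{(t)} \circ \exp(\vec{\upsilon})\}(x_n) \right\|^{2}, \tag{9}
\end{align}

where σn is the n-th diagonal entry of Σ and ∘ denotes warp composition.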

Defining Coordinate Charts on the Sphere

The cost function in Eq. (9) is a mapping from the tangent bundle TS² to the real numbers ℝ. We can think of each tangent vector υ⃗n as a 3 × 1 vector in ℝ³ tangent to the sphere at xn. Therefore υ⃗n has 2 degrees of freedom and Eq. (9) represents a constrained optimization problem. Instead of dealing with the constraints, we employ coordinate charts that are diffeomorphisms (smooth and invertible mappings) between open sets in ℝ² and open sets on S². The differential of the coordinate chart establishes correspondences between the tangent bundles Tℝ² and TS² [39], [44], so we can reparameterize the constrained optimization problem into an unconstrained one in terms of Tℝ² (see Fig. 2).

It is a well-known fact in differential geometry that covering S² requires at least two coordinate charts. Since the tools of differential geometry are coordinate-free [39], [44], our results are independent of the choice of the coordinate charts. Let e⃗n1, e⃗n2 be any two orthonormal 3 × 1 vectors tangent to the sphere at xn, where orthonormality is defined via the usual Euclidean inner product in 3D. In this work, for each mesh vertex xn, we define a local coordinate chart Ψn: ℝ² ↦ S²,

$$\Psi_n(x') = \frac{x_n + E_n x'}{\left\| x_n + E_n x' \right\|}, \qquad E_n = \left[\vec{e}_{n1}\ \ \vec{e}_{n2}\right]. \tag{10}$$

As illustrated in Fig. 2, Ψn(0) = xn. Let z⃗n be a 2 × 1 tangent vector at the origin of ℝ². With the choice of the coordinate chart above, the corresponding tangent vector at xn is given by the differential of the mapping DΨn(·) evaluated at x′ = 0:

\begin{align}
\vec{\upsilon}_n &= D\Psi_n(0)\,\vec{z}_n \tag{11}\\
&= \left[ \frac{\partial \Psi_n(x')}{\partial x'} \right]_{x'=0} \vec{z}_n \tag{12}\\
&= \left[ \frac{E_n}{\left\| x_n + E_n x' \right\|} - \frac{(x_n + E_n x')(x_n + E_n x')^{T} E_n}{\left\| x_n + E_n x' \right\|^{3}} \right]_{x'=0} \vec{z}_n \tag{13}\\
&= E_n \vec{z}_n. \tag{14}
\end{align}

The last equality uses ‖xn‖ = 1 and xnᵀEn = 0. The above equation defines the mapping of a tangent vector z⃗n at the origin of ℝ² to the tangent vector υ⃗n at xn via the differential of the coordinate chart DΨn at x′ = 0. We note that for a tangent vector at an arbitrary point in ℝ², the expression for the corresponding tangent vector on the sphere is more complicated. This motivates our definition of a separate chart for each mesh vertex, to simplify the derivations.
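The claim that the differential of the local chart at the origin maps z⃗n to Enz⃗n can be checked with a small numerical sketch. It assumes the chart has the normalized form Ψn(x′) = (xn + Enx′)/‖xn + Enx′‖ on the unit sphere. The function names and the particular way of building the tangent basis are illustrative assumptions, and the differential is estimated by central finite differences.

```python
import numpy as np

def tangent_basis(x):
    """Two orthonormal 3-vectors tangent to the unit sphere at x, as columns."""
    helper = np.zeros(3)
    helper[np.argmin(np.abs(x))] = 1.0  # axis least aligned with x
    e1 = np.cross(x, helper)
    e1 /= np.linalg.norm(e1)
    e2 = np.cross(x, e1)
    return np.column_stack([e1, e2])

def chart(x, E, p):
    """Assumed local chart: map p in R^2 to the sphere by normalizing x + E p."""
    q = x + E @ p
    return q / np.linalg.norm(q)

def chart_differential_at_origin(x, E, h=1e-5):
    """Central finite-difference estimate of the 3 x 2 differential at p = 0."""
    cols = []
    for i in range(2):
        step = np.zeros(2)
        step[i] = h
        cols.append((chart(x, E, step) - chart(x, E, -step)) / (2.0 * h))
    return np.column_stack(cols)

x = np.array([1.0, 2.0, 2.0]) / 3.0  # a unit vector
E = tangent_basis(x)
J = chart_differential_at_origin(x, E)
```

Under these assumptions `J` agrees with `E` up to finite-difference error, so a 2 × 1 vector `z` at the origin of the chart corresponds to the tangent vector `E @ z` at `x`.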

Gauss-Newton Step of Spherical Demons

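From Eq. (14), we obtain exp(υ⃗) = exp({υ⃗n}) = exp({Enz⃗n}) and rewrite Eq. (9) as an unconstrained optimization problem:

\begin{align}
\{\vec{z}_n^{(t)}\} &= \operatorname*{argmin}_{\{\vec{z}_n\}}\ \sum_{n=1}^{N} \frac{1}{\sigma_n^{2}} \left(F(x_n) - M \circ \{\Upsilon^{(t)} \circ \exp(\{E_n \vec{z}_n\})\}(x_n)\right)^{2} + \frac{1}{\sigma_x^{2}} \sum_{n=1}^{N} \left\| \vec{\Upsilon}_n^{(t)} + G_n^{2}\,\{\Upsilon^{(t)} \circ \exp(\{E_n \vec{z}_n\})\}(x_n) \right\|^{2} \tag{15}\\
&\triangleq \operatorname*{argmin}_{\{\vec{z}_n\}}\ \sum_{n=1}^{N} f_n^{2}(\vec{z}) + \sum_{n=1}^{N} \left\| g_n(\vec{z}) \right\|^{2}. \tag{16}
\end{align}

This non-linear least-squares form can be optimized efficiently with the Gauss-Newton method, which requires finding the gradient of both terms with respect to {z⃗n} at {z⃗n = 0} and solving a linearized least-squares problem.

We let m⃗nᵀ be the 1 × 3 spatial gradient of the warped moving image M ∘ ϒ(t)(·) at xn and note that m⃗n is tangent to the sphere at xn. The computation of m⃗nᵀ is discussed in Appendix A-A. Defining un ≜ exp({Enz⃗n})(xn), we differentiate the first term of the cost function fn(z⃗) in Eq. (15) using the chain rule, resulting in the 1 × 2 vector:

\begin{align}
\left. \frac{\partial f_n}{\partial \vec{z}_k} \right|_{\vec{z}=0} &= -\frac{1}{\sigma_n} \left[ \frac{\partial}{\partial \vec{z}_k}\, M \circ \Upsilon^{(t)} \circ \exp(\{E_n \vec{z}_n\})(x_n) \right]_{\vec{z}=0} \tag{17}\\
&= -\frac{1}{\sigma_n} \left[ \frac{\partial\, M \circ \Upsilon^{(t)}(u_n)}{\partial u_n} \right]_{u_n = x_n} \left[ \frac{\partial u_n}{\partial \vec{\upsilon}_k} \right]_{\vec{\upsilon}=0} \left[ \frac{\partial \vec{\upsilon}_k}{\partial \vec{z}_k} \right]_{\vec{z}=0} \tag{18}\\
&= -\frac{1}{\sigma_n}\, \vec{m}_n^{T} \left[ \delta(k,n)\, I_{3\times3} \right] E_k \tag{19}\\
&= -\frac{1}{\sigma_n}\, \vec{m}_n^{T} E_n\, \delta(k,n), \tag{20}
\end{align}

where δ(k, n) = 1 if k = n and 0 otherwise. Eq. (20) uses the fact that the differential of exp(υ⃗) at υ⃗ = 0 is the identity [47], i.e., [D exp(0)] υ⃗ = υ⃗. In other words, a change in velocity υ⃗k at vertex xk does not affect exp(υ⃗)(xn) for n ≠ k up to the first order derivatives.

Similarly, we define Sn to be the 3 × 3 Jacobian of ϒ(t)(·) at xn. The computation of Sn is discussed in Appendix A-B. Differentiating the second term of the cost function gn(z⃗) in Eq. (15) using the chain rule, we get the 3 × 2 matrix:

$$\left. \frac{\partial g_n}{\partial \vec{z}_k} \right|_{\vec{z}=0} = \frac{1}{\sigma_x}\, G_n^{2} S_n E_n\, \delta(k,n), \tag{21}$$

where Gn is the skew-symmetric matrix defined in Eq. (4).

Once the derivatives are known, we can compute the corresponding gradients based on our choice of the metric of vector fields on S². In this work, we assume an ℓ2 inner product, so that the inner product of vector fields is equal to the sum of the inner products of the individual vectors. The inner product of individual vectors is in turn specified by the choice of the Riemannian metric on S². Assuming the canonical metric, so that the inner product of two tangent vectors is the usual inner product in the Euclidean space [39], the gradients are equal to the transpose of the derivatives in Eqs. (20), (21) (see Appendix A-C). We can then rewrite Eq. (15) as a linearized least-squares objective function:

\begin{align}
\{\vec{z}_n^{(t)}\} &= \operatorname*{argmin}_{\{\vec{z}_n\}}\ \sum_{n=1}^{N} f_n^{2}(\vec{z}) + \left\| g_n(\vec{z}) \right\|^{2} \tag{22}\\
&\approx \operatorname*{argmin}_{\{\vec{z}_n\}}\ \sum_{n=1}^{N} \left( f_n(0) + \sum_{k} \left. \frac{\partial f_n}{\partial \vec{z}_k} \right|_{0} \vec{z}_k \right)^{2} + \left\| g_n(0) + \sum_{k} \left. \frac{\partial g_n}{\partial \vec{z}_k} \right|_{0} \vec{z}_k \right\|^{2} \tag{23}\\
&= \operatorname*{argmin}_{\{\vec{z}_n\}}\ \sum_{n=1}^{N} \frac{1}{\sigma_n^{2}} \left( F(x_n) - M \circ \Upsilon^{(t)}(x_n) - \vec{m}_n^{T} E_n \vec{z}_n \right)^{2} + \frac{1}{\sigma_x^{2}} \left\| G_n^{2} S_n E_n \vec{z}_n \right\|^{2}, \tag{24}
\end{align}

where gn(0) = 0 by Eq. (5). Because of the delta function δ(k, n) in the derivatives in Eqs. (20), (21), z⃗n only appears in the n-th term of the cost function Eq. (24). The solution of Eq. (24) can therefore be computed independently for each z⃗n. Solving this linear least-squares equation yields an update rule for z⃗n:

$$\vec{z}_n^{(t)} = \left( F(x_n) - M \circ \Upsilon^{(t)}(x_n) \right) \left( E_n^{T} \left[ \vec{m}_n \vec{m}_n^{T} + \frac{\sigma_n^{2}}{\sigma_x^{2}}\, S_n^{T} G_n^{4} S_n \right] E_n \right)^{-1} E_n^{T} \vec{m}_n. \tag{25}$$

For each vertex, we only need to perform matrix-vector multiplication of up to 3 × 3 matrices and matrix inversion of 2 × 2 matrices. This implies the update rule for υ⃗n:

\begin{align}
\vec{\upsilon}_n^{(t)} &= E_n \vec{z}_n^{(t)} \tag{26}\\
&= \left( F(x_n) - M \circ \Upsilon^{(t)}(x_n) \right) E_n \left( E_n^{T} \left[ \vec{m}_n \vec{m}_n^{T} + \frac{\sigma_n^{2}}{\sigma_x^{2}}\, S_n^{T} G_n^{4} S_n \right] E_n \right)^{-1} E_n^{T} \vec{m}_n. \tag{27}
\end{align}

In practice, we use the Levenberg-Marquardt modification of Gauss-Newton optimization [49] to ensure matrix invertibility:

$$\vec{\upsilon}_n^{(t)} = \left( F(x_n) - M \circ \Upsilon^{(t)}(x_n) \right) E_n \left( E_n^{T} \left[ \vec{m}_n \vec{m}_n^{T} + \frac{\sigma_n^{2}}{\sigma_x^{2}}\, S_n^{T} G_n^{4} S_n \right] E_n + \epsilon I_{2\times2} \right)^{-1} E_n^{T} \vec{m}_n, \tag{28}$$

where ε is a regularization constant. We note that in the classical Euclidean Demons [57], [14], the term EnᵀSnᵀGn⁴SnEn turns out to be the identity, so it can also be seen as utilizing Levenberg-Marquardt optimization. Once again, we emphasize that a different choice of the coordinate charts will lead to the same update.

Given υ⃗(t) = {υ⃗n(t)}, we use “scaling and squaring” to compute exp(υ⃗(t)) [3], which is then composed with the current transformation estimate ϒ(t) to form Γ(t) = ϒ(t) ∘ exp(υ⃗(t)). Appendix D discusses implementation details of extending the “scaling and squaring” procedure in Euclidean spaces to S².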
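A per-vertex Levenberg-Marquardt update of this kind can be sketched as follows. The sketch is an illustration under stated assumptions, not the released implementation: it assumes the 2 × 2 system is built from the image-gradient term m mᵀ and the distance term (σn²/σx²) Sᵀ G⁴ S, both projected onto a tangent basis E, with ε added to the diagonal. All function and argument names are hypothetical; `m_grad` stands for the tangent image gradient and `S` for the 3 × 3 Jacobian of the current transformation.

```python
import numpy as np

def skew(x):
    """3 x 3 skew-symmetric matrix G with G @ v == np.cross(x, v)."""
    return np.array([[0.0, -x[2], x[1]],
                     [x[2], 0.0, -x[0]],
                     [-x[1], x[0], 0.0]])

def tangent_basis(x):
    """Two orthonormal 3-vectors tangent to the unit sphere at x, as columns."""
    helper = np.zeros(3)
    helper[np.argmin(np.abs(x))] = 1.0
    e1 = np.cross(x, helper)
    e1 /= np.linalg.norm(e1)
    return np.column_stack([e1, np.cross(x, e1)])

def lm_velocity_update(x, f_val, m_val, m_grad, S, sigma_n, sigma_x, eps):
    """Assumed update: v = (F - M) E (E^T [m m^T + c S^T G^4 S] E + eps I)^-1 E^T m,
    with c = sigma_n^2 / sigma_x^2. Only a 2 x 2 system is solved per vertex."""
    E = tangent_basis(x)
    G2 = skew(x) @ skew(x)
    G4 = G2 @ G2
    c = sigma_n**2 / sigma_x**2
    H = E.T @ (np.outer(m_grad, m_grad) + c * S.T @ G4 @ S) @ E + eps * np.eye(2)
    z = (f_val - m_val) * np.linalg.solve(H, E.T @ m_grad)
    return E @ z  # 3 x 1 velocity vector, tangent at x by construction

x = np.array([0.0, 0.0, 1.0])
m_grad = np.array([0.3, -0.4, 0.0])  # tangent to the sphere at x
v = lm_velocity_update(x, f_val=1.0, m_val=0.2, m_grad=m_grad, S=np.eye(3),
                       sigma_n=1.0, sigma_x=2.0, eps=0.1)
```

With S set to the identity, the distance term reduces to a multiple of the 2 × 2 identity, and the update collapses to the familiar Demons-style force (F − M) m / (‖m‖² + σn²/σx² + ε), which is a useful sanity check for an implementation.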

C. Choice of Reg(ϒ)

We now define the Reg(ϒ) term using the corresponding tangent vector field representation ϒ⃗. Following the work of [31], [61], we let H be the Hilbert space of square integrable vector fields on the sphere defined by the inner product:

$$\langle \vec{u}_1, \vec{u}_2 \rangle_H = \int_{S^2} \langle \vec{u}_1(x), \vec{u}_2(x) \rangle_R \, dx, \tag{29}$$

where u⃗1, u⃗2 ∈ H and 〈·,·〉R refers to the canonical metric. Because vector fields from H are not necessarily smooth, we restrict the deformation ϒ⃗ to belong to the Hilbert space V ⊂ H of vector fields obtained by the closure of the space of smooth vector fields on S² via a choice of the so-called energetic inner product denoted by

$$\langle \vec{u}_1, \vec{u}_2 \rangle_V = \langle L \vec{u}_1, \vec{u}_2 \rangle_H, \tag{30}$$

where L could for example be the Laplacian operator on smooth vector fields on S² [31], [61].

We define Reg(ϒ) ≜ ‖ϒ⃗‖V. With a proper choice of the energetic inner product (e.g.,

Laplacian), a smaller value of ‖ϒ⃗‖V corresponds to a smoother vector field and thus

smoother transformation ϒ. As we will see later in this section, the exact choice of the inner

product is unimportant in our implementation.

D. Optimizing Step 2 of Spherical Demons

With our choice of dist(ϒ, Γ) in Section III-A and Reg(ϒ) in Section III-C, the optimization in Step 2 of the Spherical Demons algorithm

$$\Upsilon^{(t+1)} = \operatorname*{argmin}_{\Upsilon}\ \frac{1}{\sigma_x^{2}} \sum_{n=1}^{N} \left\| \vec{\Upsilon}_n - \vec{\Gamma}_n^{(t)} \right\|^{2} + \frac{1}{\sigma_T^{2}} \left\| \vec{\Upsilon} \right\|_V \tag{31}$$

seeks a smooth vector field ϒ⃗ ∈ V that approximates the tangent vectors {Γ⃗n(t)}. This problem corresponds to the inexact vector spline interpolation problem solved in [31], motivating our use of tangent vectors in the definition of dist(ϒ, Γ) in Section III-A, instead of the more intuitive choice of geodesic distance.

We can express the tangent vectors Γ⃗n and ϒ⃗n as EnΓn and Enϒn respectively. Essentially, this represents Γ⃗n and ϒ⃗n in terms of the tangent space basis En at xn, where Γn and ϒn are the components of the tangent vectors with respect to this basis. Let Γ̂ and ϒ̂ be 2N × 1 vectors corresponding to stacking Γn and ϒn respectively. The particular optimization formulated in Eq. (31) has a unique optimum [31], given by

$$\hat{\Upsilon} = K \left( \frac{\sigma_x^{2}}{\sigma_T^{2}}\, I_{2N \times 2N} + K \right)^{-1} \hat{\Gamma}, \tag{32}$$

where K is a 2N × 2N matrix consisting of N × N blocks of 2 × 2 matrices: the (i,j) block

corresponds to k(xi,xj)Txi,xj. The 2 × 2 linear transformation Txi,xj(·) parallel transports a

tangent vector along the great circle from TxiS2 to TxjS2. k(xi,xj) is a non-negative scalar

function uniquely determined by the choice of the energetic norm. Typically, k(xi,xj)

monotonically decreases as a function of the distance between xi and xj. The proof of the

uniqueness of the global optimum and the form of solution in Eq. (32) follow from the fact

that the Hilbert space V is a reproducing kernel Hilbert space (RKHS), allowing the

exploitation of the Riesz representation theorem [31]. This offers a wide range of choices of

regularization depending on the choice of the energetic norm and the corresponding RKHS.

In [31], the spherical vector spline interpolation problem was applied to landmark matching on S², resulting in a reasonably sized linear system of equations. Performing the matrix inversion shown in Eq. (32) is unfortunately prohibitive for cortical surfaces with more than 100,000 vertices. If one chooses a relatively wide kernel k(xi,xj), the system is not even sparse.

Inspired by the convolution method of optimizing Step 2 in the Demons algorithm [14],

[57], [66] and the convolution-based fast fluid registration in the Euclidean space [12], we

propose an iterative smoothing approximation to the solution of the spherical vector spline

interpolation problem.

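In each smoothing iteration, for each vertex xi, tangent vectors of neighboring vertices xj are parallel transported to xi and linearly combined with the tangent vector at xi. The weights for the linear combination are set to

$$\lambda(x_i, x_i) = \frac{1}{1 + |N_i| \exp\left(-\frac{1}{2\gamma}\right)} \quad \text{and} \quad \lambda(x_i, x_j) = \frac{\exp\left(-\frac{1}{2\gamma}\right)}{1 + |N_i| \exp\left(-\frac{1}{2\gamma}\right)}$$

for i ≠ j, where |Ni| is the number of neighboring vertices of xi. Therefore, a larger number of iterations m and larger values of γ result in a greater amount of smoothing.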

We note that the iterative smoothing approximation to spline interpolation is not exact because parallel transport is not transitive on S² due to the nonzero curvature of S² (unlike in Euclidean space), i.e., parallel transporting a tangent vector from point a to b to c results in a vector different from the result of parallel transporting a tangent vector from a to c.

Furthermore, the approximation accuracy degrades as the distribution of points becomes less

uniform. In Appendix B, we provide a theoretical bound on the approximation error and

demonstrate empirically that iterative smoothing provides a good approximation of spherical

vector spline interpolation for a relatively uniform distribution of points corresponding to

those of the subdivided icosahedron meshes used in this work.
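As an illustration of the two points above, the following minimal Python sketch parallel transports tangent vectors along great circles of the unit sphere, shows that transporting a → b → c differs from transporting a → c, and applies one smoothing iteration to a toy three-vertex neighborhood. It is a sketch under stated assumptions, not the implementation used in this work: the helper names (`transport`, `smooth_once`) and the toy mesh are hypothetical, and the weights, 1/(1 + |Ni| w) for the vertex itself and w/(1 + |Ni| w) for each neighbor with w = exp(−1/(2γ)), are an assumption chosen so that larger γ gives more smoothing.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def scale(a, s):
    return tuple(x * s for x in a)

def add(*vectors):
    return tuple(sum(components) for components in zip(*vectors))

def transport(v, a, b):
    """Parallel transport tangent vector v from a to b along their great circle.

    a and b are unit vectors on the sphere; antipodal pairs are not handled.
    """
    axis = cross(a, b)
    sin_angle = math.sqrt(dot(axis, axis))
    if sin_angle < 1e-12:  # a and b coincide: nothing to do
        return v
    k = scale(axis, 1.0 / sin_angle)
    cos_angle = dot(a, b)
    # Rodrigues rotation of v about k by the angle between a and b
    return add(scale(v, cos_angle),
               scale(cross(k, v), sin_angle),
               scale(k, dot(k, v) * (1.0 - cos_angle)))

def smooth_once(points, vectors, neighbors, gamma):
    """One smoothing iteration with the assumed weights described above."""
    w = math.exp(-1.0 / (2.0 * gamma))
    smoothed = []
    for i, x in enumerate(points):
        total = vectors[i]  # the vertex itself has weight 1 before normalization
        for j in neighbors[i]:
            total = add(total, scale(transport(vectors[j], points[j], x), w))
        smoothed.append(scale(total, 1.0 / (1.0 + len(neighbors[i]) * w)))
    return smoothed

# Non-transitivity on one octant of the unit sphere
a, b, c = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
v = (0.0, 0.0, 1.0)  # tangent vector at a
via_b = transport(transport(v, a, b), b, c)
direct = transport(v, a, c)

# One smoothing pass on a toy three-vertex "mesh" (hypothetical data)
points = [a, b, c]
vectors = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
smoothed = smooth_once(points, vectors, neighbors, gamma=1.0)
```

In this example the two transported vectors are both tangent at c but orthogonal to each other; the 90° discrepancy equals the area of the octant enclosed by the path, which is why the error shrinks when neighboring vertices are close together.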

E. Remarks

The Spherical Demons algorithm is summarized in Algorithm 2.

We run the Spherical Demons algorithm in a multi-scale fashion on a subdivided icosahedral

mesh. We begin from a subdivided icosahedral mesh (ic4) that contains 2,562 vertices and

work up to a subdivided icosahedral mesh (ic7) that contains 163,842 vertices, which is

roughly equal to the number of vertices in the cortical meshes we work with. We perform 15

iterations of Step 1 and Step 2 at each level. Because of the fast convergence rate of the

Gauss-Newton method, we find that 15 iterations are more than sufficient for our purposes.

We also perform a rotational registration at the beginning of each multi-scale level via a

sectioned search of the three Euler angles.
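The vertex counts quoted above follow from the standard 4-to-1 triangle subdivision of the icosahedron, which gives 10·4^k + 2 vertices at level k. The short Python sketch below tabulates a possible multi-scale schedule; the intermediate levels ic5 and ic6 and the `schedule` structure are assumptions made for illustration.

```python
def icosahedron_counts(level):
    """Vertex, edge and face counts after `level` 4-to-1 subdivisions."""
    faces = 20 * 4 ** level
    edges = 30 * 4 ** level
    vertices = 10 * 4 ** level + 2
    return vertices, edges, faces

# (mesh name, number of vertices, iterations of Step 1 and Step 2)
schedule = [("ic%d" % k, icosahedron_counts(k)[0], 15) for k in range(4, 8)]

for name, n_vertices, n_iterations in schedule:
    print(name, n_vertices, n_iterations)
```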

Empirically, we find the computation time of the Spherical Demons algorithm is roughly

divided equally among the four components: registration by rotation, computing the Gauss-

Newton update, performing “scaling and squaring” and smoothing the vector field.


Algorithm 2: Spherical Demons Algorithm

Data: A fixed spherical image F and moving spherical image M.

Result: Diffeomorphism Γ so that M ∘ Γ is “close” to F.

Set ϒ(0) = identity transformation (or some a-priori transformation, e.g., from a previous registration)

repeat

    Step 1. Given ϒ(t),

        foreach vertex n do

            Compute the update velocity at vertex n using Eq. (28).

        end

        Compute Γ(t) = exp(v̄) using “scaling and squaring”.

    Step 2. Given Γ(t),

        foreach vertex n do

            Compute the smoothed transformation ϒ(t+1) at vertex n using Eq. (48) implemented via iterative smoothing.

        end

until convergence;
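To make the “scaling and squaring” step concrete, the following Python sketch computes the exponential of a stationary velocity field on the interval [0, 1] rather than on the sphere. This one-dimensional Euclidean setting is a deliberate simplification chosen so that the result can be checked against a closed-form flow; the function names, grid size and number of squarings are illustrative assumptions, not values used in this work.

```python
import math

def interp(samples, x):
    """Linearly interpolate samples of a function on a uniform grid over [0, 1]."""
    n = len(samples)
    s = min(max(x, 0.0), 1.0) * (n - 1)
    i = min(int(s), n - 2)
    t = s - i
    return (1.0 - t) * samples[i] + t * samples[i + 1]

def exp_by_scaling_and_squaring(velocity, n=201, squarings=8):
    """Approximate exp(velocity) on [0, 1] as a sampled map phi(x_i)."""
    xs = [i / (n - 1) for i in range(n)]
    # Scaling: for a small enough field, exp(v / 2^K) is close to id + v / 2^K.
    phi = [x + velocity(x) / 2 ** squarings for x in xs]
    # Squaring: exp(v) = exp(v / 2) composed with itself, applied K times.
    for _ in range(squarings):
        phi = [interp(phi, p) for p in phi]
    return xs, phi

rate = 1.5
xs, phi = exp_by_scaling_and_squaring(lambda x: rate * x * (1.0 - x))

# Closed-form time-1 flow of dx/dt = rate * x * (1 - x) (logistic equation)
exact = [x * math.exp(rate) / (1.0 - x + x * math.exp(rate)) for x in xs]
max_error = max(abs(p - e) for p, e in zip(phi, exact))
is_increasing = all(p < q for p, q in zip(phi, phi[1:]))
```

Each squaring costs one resampling of the map, so K squarings stand in for 2^K small integration steps. In one dimension, a strictly increasing map that fixes the endpoints is invertible, which is the analogue of the diffeomorphism property sought on the sphere.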

In practice, we work with spheres that are normalized to be of radius 100, because we find

that at ic7, the average edge length of 1mm corresponds to that of the original cortical

surface meshes. This allows for meaningful interpretation of distances on the sphere. The

non-unit radius requires slight modification of the equations presented previously to keep track of the radius

of the sphere.
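The relation between the radius and the edge length can be checked with a back-of-the-envelope calculation. The Python sketch below assumes, as an idealization, that the 20·4^k faces are equal equilateral triangles whose total area equals that of the sphere; the actual mesh triangles are neither exactly equal nor exactly equilateral, so the result is only an estimate.

```python
import math

def mean_edge_length(level, radius):
    """Estimated edge length of a level-`level` icosahedral mesh on a sphere."""
    faces = 20 * 4 ** level
    area_per_face = 4.0 * math.pi * radius ** 2 / faces
    # Area of an equilateral triangle with edge e is (sqrt(3) / 4) * e^2.
    return math.sqrt(4.0 * area_per_face / math.sqrt(3.0))

edge_ic7 = mean_edge_length(7, 100.0)
print(round(edge_ic7, 2))
```

Under these assumptions the estimate at ic7 with radius 100 is about 0.94, consistent with the roughly 1mm average edge length stated above.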

The Spherical Demons algorithm presented here registers pairs of spherical images. To

incorporate a probabilistic atlas defined by a mean image and a standard deviation image,

we modify the Demons objective function in Eq. (3), as explained in Appendix C. This

requires a choice of warping the subject or warping the atlas. We find that interpolating the

atlas gives slightly better results, compared with interpolating the subject. However,

interpolating the subject results in a runtime of under 3 minutes, while the runtime for

interpolating the atlas is less than 5 minutes. In the next section, we report results for

interpolating the atlas.

IV. Experiments

We use two sets of experiments to evaluate the performance of the Spherical Demons

algorithm by comparing it to the widely used and freely available FreeSurfer [27] software.

The FreeSurfer registration algorithm uses the same similarity measure as Demons, but

explicitly penalizes for metric and areal distortion. As we will show, even though the

Spherical Demons algorithm does not specifically take into account the original metric

properties of the cortical surface, we still achieve comparable if not better registration

accuracy than FreeSurfer. Furthermore, FreeSurfer runtime is more than an hour while

Spherical Demons runtime is less than 5 minutes.

There are four parameters in the algorithm. $1/\sigma_x^2$ and ε appear in Eq. (28). Larger values of $1/\sigma_x^2$ and ε decrease the size of the update taken in Step 1 of the Spherical Demons algorithm. In the experiments that follow, we set $1/\sigma_x^2 = \epsilon$ and set their values such that the largest vector of the update velocity field is roughly twice the edge length of the mesh. The number of iterations m and the weight $\exp\left(-\frac{1}{2\gamma}\right)$ determine the degree of smoothing. We set γ = 1 and explore a range of smoothing iterations m in the following experiments.

A. Parcellation of In-Vivo Cortical Surfaces

We validate Spherical Demons in the context of automatic cortical parcellation. Automatic

labeling of cortical brain surfaces is important for identifying regions of interest for

clinical, functional and structural studies [20], [52]. Recent efforts have ranged from the

identification of sulcal/gyral ridge lines [56], [62] to the segmentation of sulcal/gyral basins

[20], [28], [38], [41], [42], [51], [52], [67]. Similar to these prior studies, we are interested in

parcellation of the entire cortical surface meshes, where each vertex is assigned a label.

We consider a set of 39 left and right cortical surface models extracted from in-vivo MRI

[19]. Each surface is spherically parameterized and represented as a spherical image with

geometric features at each vertex: mean curvature of the cortical surfaces, mean curvature of

the inflated cortical surfaces and average convexity of the cortical surfaces, which roughly

corresponds to sulcal depth [26]. These features are intrinsic and thus independent of the

parameterization of the surface. The tools used for segmentation [19] and spherical

parameterization [26] are freely available [29]. Both hemispheres of each subject were

manually parcellated by a neuroanatomist into 35 labels, corresponding to the main sulci and

gyri, enumerated in Table II.

We co-register all 39 spherical images of cortical geometry with Spherical Demons by

iteratively building an atlas and registering the surfaces to the atlas. The atlas consists of the

mean and variance of cortical geometry represented by the surface features described above.

We then perform 4-fold cross-validation of the parcellation of the co-registered cortical

surfaces. In each iteration of cross-validation, we leave out ten subjects (nine in the last

iteration) and use the remainder of the subjects to train a classifier [20], [28] that predicts the labels based on

location and geometric features. We then apply the classifier to the hold-out

subjects. We perform each iteration with a different hold-out set, i.e., subjects 1-10, 11-20,

21-30 and 31-39.
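The hold-out sets listed above can be generated as contiguous blocks of subject indices. The Python sketch below is illustrative only; the function name and the 1-based subject numbering are assumptions.

```python
def contiguous_folds(n_subjects, fold_size):
    """Split subjects 1..n_subjects into consecutive hold-out sets."""
    subjects = list(range(1, n_subjects + 1))
    return [subjects[i:i + fold_size] for i in range(0, n_subjects, fold_size)]

folds = contiguous_folds(39, 10)
for hold_out in folds:
    # Every subject outside the hold-out set is used for training
    training = [s for fold in folds if fold is not hold_out for s in fold]
    print(hold_out[0], hold_out[-1], len(training))
```

With 39 subjects, the first three hold-out sets contain ten subjects each and the last contains nine.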

As mentioned previously, increasing the number of iterations of smoothing results in

smoother warps. As discussed in [67], the choice of the tradeoff between the similarity

measure and regularization is important for segmentation accuracy. Estimating the optimal

registration regularization tradeoff is an active area of research [1], [18], [48], [65], [67],

[68] that we do not deal with in this paper. Here, we simply repeat the above experiments

using {6, 8, 10, 12, 14} iterations of smoothing. For brevity, we will focus the discussion on

using 10 iterations of smoothing and comment on results obtained with the other levels of

smoothing.

We repeat the above procedure of performing co-registration and cross-validation with the

FreeSurfer registration algorithm [27] using the default FreeSurfer settings. Once again, we

use the same features and parcellation algorithm [20], [28]. As before, the atlas consists of

the mean and variance of cortical geometry.

To compare the cortical parcellation results, we compute the average Dice measure, defined

as the ratio of cortical surface area with correct labels to the total surface area averaged over

the test set. Because the average Dice can be misleading by suppressing small structures, we

also compute the Dice measure for each structure.
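A minimal Python sketch of the two measures follows. It assumes that each vertex carries a surface-area weight and that the per-structure measure is the standard Dice coefficient 2|A ∩ B| / (|A| + |B|) computed on areas; the function names and the toy labels are hypothetical.

```python
def overall_accuracy(predicted, truth, area):
    """Correctly labeled surface area divided by total surface area."""
    correct = sum(a for p, t, a in zip(predicted, truth, area) if p == t)
    return correct / sum(area)

def structure_dice(predicted, truth, area, label):
    """Area-weighted Dice overlap for a single structure."""
    area_pred = sum(a for p, a in zip(predicted, area) if p == label)
    area_true = sum(a for t, a in zip(truth, area) if t == label)
    area_both = sum(a for p, t, a in zip(predicted, truth, area)
                    if p == label and t == label)
    if area_pred + area_true == 0:
        return float("nan")  # structure absent from both labelings
    return 2.0 * area_both / (area_pred + area_true)

# Toy example: four vertices, two structures (made-up labels and areas)
predicted = [1, 1, 2, 2]
truth = [1, 2, 2, 2]
area = [1.0, 1.0, 2.0, 1.0]
overall = overall_accuracy(predicted, truth, area)
dice_1 = structure_dice(predicted, truth, area, 1)
dice_2 = structure_dice(predicted, truth, area, 2)
```

In the toy example the overall measure is 0.8, whereas the smaller structure 1 scores only 2/3, which illustrates how the overall figure can mask errors in small structures.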


On the left hemisphere, FreeSurfer achieves an average Dice of 88.9, while Spherical

Demons achieves an average Dice of 89.6 with 10 iterations of smoothing. Although the

improvement is modest, the difference is statistically significant for a one-sided t-test with the

Dice measure of each subject treated as an independent sample (p = 2 × 10^-6). Furthermore,

the overall Dice is statistically significantly better than FreeSurfer for all levels of smoothing

we considered, with the best overall Dice achieved with 12 iterations of smoothing.

On the right hemisphere, FreeSurfer obtains a Dice of 88.8 and Spherical Demons achieves

89.1 with 10 iterations of smoothing. Here, the improvement is smaller, but still statistically

significant (p = 0.01). Furthermore, the overall Dice is statistically significantly better than

FreeSurfer for all levels of smoothing we considered, except when 6 iterations of smoothing

are used (p = 0.06). All results we report in the remainder of this section use 10 iterations of

smoothing.

We analyze the segmentation accuracy separately for each structure. To compare Spherical

Demons with FreeSurfer, we perform a one-sided paired-sample t-test treating each subject

as an independent sample and correct for multiple comparisons using a False Discovery Rate

(FDR) of 0.05 [10]. On the left (right) hemisphere, the segmentations of 16 (8) structures are

statistically significantly improved by Spherical Demons with respect to FreeSurfer, while

no structure is significantly worse.
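The multiple-comparison correction can be sketched as follows, assuming that the FDR control is the Benjamini-Hochberg step-up procedure. The p-values in the example are made up for illustration and are not results from this experiment.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return True for each hypothesis rejected at false discovery rate q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value lies under the BH line
        if p_values[i] <= rank * q / m:
            cutoff_rank = rank
    rejected = [False] * m
    for i in order[:cutoff_rank]:
        rejected[i] = True
    return rejected

# Hypothetical per-structure p-values from one-sided paired t-tests
p_values = [0.041, 0.001, 0.205, 0.008, 0.039, 0.074, 0.042, 0.060]
significant = benjamini_hochberg(p_values, q=0.05)
```

In this toy example only the two smallest p-values survive the correction, even though five of the eight fall below 0.05 when taken individually.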

Fig. 3 shows the percentage improvement of individual structures over FreeSurfer. Fig. 4

displays the average Dice per structure for FreeSurfer and Spherical Demons (10 iterations

of smoothing) for the left and right hemispheres. Standard errors of the mean are displayed

as red bars. The numbering of the structures corresponds to Table II. Parcellation

improvements suggest that our registration is at least as accurate as FreeSurfer.

The structures with the worst Dice are the frontal pole and entorhinal cortex. These

structures are small and relatively poorly defined by the underlying cortical geometry. For

example, the entorhinal cortex is partially defined by the rhinal sulcus, a tiny sulcus that is

only visible on the pial surface. The frontal pole is defined by the surrounding structures,

rather than by the underlying cortical geometry.

B. Brodmann Area Localization on Ex-vivo Cortical Surfaces

Brodmann areas are cyto-architectonically defined parcellations of the cerebral cortex [13].

They can be observed through histology and, more recently, through ex-vivo high resolution

MRI [6]. Unfortunately, much of the cytoarchitectonics cannot be observed with current in-

vivo imaging. Nevertheless, most studies today report their functional findings with respect

to Brodmann areas, usually estimated by visual comparison of cortical folds with

Brodmann's original drawings without quantitative analysis of local accuracy. By combining

histology and MRI, recent methods for creating probabilistic Brodmann area maps in the

Talairach and Colin27 normalized space promise a more principled approach [2], [24], [54],

[55], [71].

In this experiment, we consider a data set that contains Brodmann labels mapped to the

corresponding MRI volume. Specifically, we work with postmortem histological images of

ten brains created using the techniques described in [54], [71]. The histological sections

were aligned to postmortem MR with nonlinear warps to build a 3D histological volume.

These volumes were segmented to separate white matter from other tissue classes, and the

segmentation was used to generate topologically correct and geometrically accurate surface

representations of the cerebral cortex using FreeSurfer [19]. The eight manually labeled

Brodmann area maps (areas 2, 4a, 4p, 6, 44, 45, 17 and 18) were sampled onto the surface

representations of each hemisphere, and errors in this sampling were manually corrected
