arXiv:2501.04794v1 [eess.IV] 8 Jan 2025
A Steerable Deep Network for Model-Free
Diffusion MRI Registration
Gianfranco Cortés and Baba C. Vemuri
Department of CISE, University of Florida, Gainesville, FL, USA
{gcortes, vemuri}@ufl.edu
Abstract. Nonrigid registration is vital to medical image analysis but
remains challenging for diffusion MRI (dMRI) due to its high-dimensional,
orientation-dependent nature. While classical methods are accurate, they
are computationally demanding, and deep neural networks, though effi-
cient, have been underexplored for nonrigid dMRI registration compared
to structural imaging. We present a novel, deep learning framework for
model-free, nonrigid registration of raw diffusion MRI data that does not
require explicit reorientation. Unlike previous methods relying on derived
representations such as diffusion tensors or fiber orientation distribution
functions, in our approach, we formulate the registration as an equiv-
ariant diffeomorphism of position-and-orientation space. Central to our
method is an SE(3)-equivariant UNet that generates velocity fields while
preserving the geometric properties of a raw dMRI’s domain. We intro-
duce a new loss function based on the maximum mean discrepancy in
Fourier space, implicitly matching ensemble average propagators across
images. Experimental results on Human Connectome Project dMRI data
demonstrate competitive performance compared to state-of-the-art ap-
proaches, with the added advantage of bypassing the overhead for esti-
mating derived representations. This work establishes a foundation for
data-driven, geometry-aware dMRI registration directly in the acquisi-
tion space.
Keywords: Diffusion MRI · Steerable CNN · Special Euclidean Group
· Nonrigid Registration · RKHS
1 Introduction
Computing a nonrigid deformation mapping one image, $S_m$, to another, $S_f$,
is an essential task in medical image analyses, most notably for cross-subject
comparisons, population-specific atlas construction, and atlas-based
segmentation. This task, known as image registration, possesses well-established
classical [12,44,2,43,27] and deep learning [3,14,15] solutions that are readily
applicable to 3D scalar-valued modalities of the form $\mathbb{R}^3 \to \mathbb{R}$,
such as structural magnetic resonance imaging (sMRI) and computed tomography (CT).
However, registration of diffusion MRI (dMRI), an imaging modality primarily used
to probe neural microstructure, requires additional care.
Within each voxel, diffusion-weighted imaging captures the diffusion profile
of water molecules, which is naturally constrained (or not) by the surrounding
tissue. The diffusion profile at a position $\mathbf{p} \in \mathbb{R}^3$ is
characterized by the ensemble average propagator (EAP), which is a probability
density $P_{\mathbf{p}}(\mathbf{r}) : \mathbb{R}^3 \to \mathbb{R}_{\geq 0}$
describing the probability of a water molecule being displaced by $\mathbf{r}$
within the effective diffusion time. Practitioners ultimately care about resolving the dominant
directions of diffusion at each voxel in order to perform downstream analysis like
tractography [28,21,10]. As a means to this end, assumptions are imposed on the
unknown EAP (e.g. Gaussian) in order to fit a simplified model (e.g. a diffusion
tensor) to the raw signal, with the hope that the principal directions of diffu-
sion are accurately captured. We refer to such models as derived representations,
examples of which include diffusion tensors (DTs) [4,45,41], fiber orientation dis-
tribution functions (fODFs) [39,38,24], orientation distribution functions [55,16],
Gaussian mixtures [25], and Hermite basis functions [30].
It follows that diffusion-weighted images and their derived representations
carry orientation-dependent information at each voxel that must be properly
transformed (i.e. reoriented) under a spatial deformation
$\Phi : \mathbb{R}^3 \to \mathbb{R}^3$. While
the details of how this reorientation is carried out will depend on the derived
representation and domain-specific assumptions (e.g. finite strain), it is always
grounded in the fact that if $\mathbf{p}$ is displaced to $\Phi(\mathbf{p})$, then
$S_m(\mathbf{p})$ is transformed according to some action of the Jacobian
$d\Phi_{\mathbf{p}}$. The literature is filled with approaches
to register both raw dMRI [18,53] and derived representations [1,52,11,33,32]
that extend the scalar-valued setting to include this required reorientation step.
These methods adhere to the classical image registration formulation where a
pair of images is iteratively aligned by minimizing a dissimilarity, as opposed to
methods that exploit deep learning.
While various deep learning methodologies [50,3,14,15] have been developed
to speedily handle the 3D, scalar-valued registration problem at a fraction of
classical runtimes, we are only aware of two data-driven frameworks [51,6] that
are designed to address the diffusion-weighted setting. Both of these methods
rely on VoxelMorph (VM)-inspired [3] backbones consisting of a UNet [35] that
estimates a nonlinear deformation $\Phi$ and a spatial transformer [22] that applies $\Phi$
to the moving image $S_m$ and handles reorientation. The key distinction between
them is their choice of input features. The first of these, DDMReg [51], requires
a fractional anisotropy (FA) map and tract orientation maps (TOMs) for each
diffusion-weighted image. Note that since these are scalar- and vector-valued
features, respectively, they can be fed to a vanilla VM backbone as is. On the
other hand, the method of [6], which we will refer to as MVCReg, requires square
root density parameterized fODF maps as input. Since these are manifold-valued
derived representations, the authors are forced to replace the VM backbone’s
vanilla convolutions with manifold-valued convolutions [9,5].
In this work, we demonstrate that data-driven registration should and can
be performed on the raw dMRI data. Thus our approach is “model-free” in this
sense. To facilitate the discussion, we refer to the scalar-valued setting as the
anat-p registration problem, and its aforementioned extension to the
diffusion-weighted regime as the diff-p registration problem (see Fig. 1). Although
derived representations offer the conceptual benefit of offloading the orientational
information to the codomain, we argue that their usage in registration suffers
from two pitfalls: (1) inherent limitations of the model chosen to approximate
the acquired data manifest themselves as lost information and (2) the
orientational information is not directly leveraged to drive the registration. This second
point motivates our formulation of the diff-pq registration problem, in which
the orientational information is pulled back into the domain, thus allowing the
deformation $\Phi$ to be a function of both position and orientation (see Fig. 1).

Type                 anat-p                           diff-p                                              diff-pq
Modalities           sMRI, CT, FA                     DT, fODF, GMM                                       raw dMRI
Driving mechanism    position                         position                                            position and orientation
Image type           $\mathbb{R}^3 \to \mathbb{R}$    $\mathbb{R}^3 \to (\mathbb{R}^3 \to \mathbb{R})$    $\mathbb{R}^3 \rtimes \mathbb{R}^3 \to \mathbb{R}$
Registration         $\Phi$ maps $S_m$ to $S_f$       $\Phi$ maps $S_m$ to $S_f$, plus reorientation      $\Phi$ maps $S_m$ to $S_f$
Fig. 1. Comparison of three registration scenarios in which we try to estimate a
deformation $\Phi$ that minimizes $\mathcal{L}(S_f, S_m \circ \Phi)$ for some dissimilarity $\mathcal{L}$.
This change in perspective demands that we borrow tools from the field of
geometric deep learning (GDL) [7,48], with the aim of generating deformations
$\Phi$ that respect the geometry of the newly introduced non-Euclidean domain. We
introduce several novel contributions including (1) the diff-pq registration prob-
lem and its formalization, which in particular avoids explicit Jacobian estimation
and reorientation, (2) an SE(3)-equivariant, end-to-end, VM-inspired network
capable of preserving the geometry of a raw diffusion-weighted signal’s domain,
(3) use of maximum mean discrepancy (MMD) [36] in the Fourier space as a
loss function for dMRI registration, and (4) an experimental evaluation on HCP
dMRI scans comparing our method to SOTA dMRI registration approaches.
The rest of the paper is organized as follows: In Section 2, we briefly present
group-theoretic prerequisites before delving into the theoretical underpinnings of
our proposed registration framework. In Section 3, we present the construction
of the SE(3)-equivariant convolution layers and our VM-inspired architecture.
Section 4 contains experimental results and conclusions are drawn in Section 5.
2 Background
2.1 Representations and Equivariance
Definition 1. An action of a group $G$ on a space $M$ is a mapping $(g, p) \mapsto g \cdot p$
satisfying the following properties for all $g, g' \in G$ and $p \in M$: (1) $e \cdot p = p$
where $e$ is the group identity and (2) $g \cdot (g' \cdot p) = (g g') \cdot p$.
Definition 2. A $d$-dimensional representation of a group $G$ is a map
$\rho : G \to GL(d)$ that satisfies $\rho(g g') = \rho(g)\rho(g')$ for all $g, g' \in G$.
Definition 3. Let $M$ and $N$ be spaces with a group $G$ acting on each of them.
A function $\Psi : M \to N$ is equivariant to $G$ if $\Psi(g \cdot_M p) = g \cdot_N \Psi(p)$ for all
$g \in G$ and $p \in M$.
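As an illustration that is not part of the original paper, the three definitions can be checked numerically on a small example. The sketch below assumes rotations about the z-axis as the group, with the hypothetical helper names `rot_z` and `psi` chosen only for this example.

```python
import numpy as np

def rot_z(theta):
    """Rotation about the z-axis: a 3-dimensional representation of SO(2)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def psi(p):
    """Hypothetical map R^3 -> R^3 that rescales a vector by its own norm."""
    return np.linalg.norm(p) * p

a, b = 0.7, -1.9                      # two group elements (rotation angles)
p = np.array([1.0, -2.0, 0.5])        # a point of the space M = R^3

# Definition 2: rho(g g') = rho(g) rho(g'); composing rotations adds angles.
is_representation = np.allclose(rot_z(a + b), rot_z(a) @ rot_z(b))

# Definition 1: e.p = p and g.(g'.p) = (g g').p for the action g.p = rho(g) p.
is_action = np.allclose(rot_z(0.0) @ p, p) and np.allclose(
    rot_z(a) @ (rot_z(b) @ p), rot_z(a + b) @ p)

# Definition 3: psi(g.p) = g.psi(p), since rotations preserve the norm.
is_equivariant = np.allclose(psi(rot_z(a) @ p), rot_z(a) @ psi(p))
```

A map such as `p + 1.0` would fail the last check, because adding a fixed vector does not commute with rotation.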
2.2 The Geometry of Diffusion-Weighted Images
To a first approximation, a diffusion-weighted acquisition sequence can be
described as follows: first, select a finite number of directions
$\mathbf{g}_j \in S^2 \subset \mathbb{R}^3$ along
which diffusion-sensitized magnetic field gradients are applied; second, acquire
a 3D volume for each $\mathbf{g}_j$. At first glance, this would suggest that a raw dMRI
is a function $S : \mathbb{R}^3 \times S^2 \to \mathbb{R}$, since there is a 3D volume
$S(\cdot, \mathbf{g}) : \mathbb{R}^3 \to \mathbb{R}$
for every direction $\mathbf{g} \in S^2$. However, there are two additional idiosyncrasies of a
diffusion-weighted acquisition sequence that need to be introduced:
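a) For a fixed direction $\mathbf{g}$, we can scale the strength of the applied
diffusion-sensitized magnetic field gradient, where the strength is proportional to (the
square root of) the so-called b-value. An important and required special case
is when $b = 0$.
b) For a fixed b-value, a dMRI exhibits antipodal symmetry, i.e. the 3D volume
acquired along $\mathbf{g}$ is equal to the 3D volume acquired along $-\mathbf{g}$.
By (a), there exists a “shell” of directions for each b-value $b \geq 0$ (where $b = 0$
corresponds to a degenerate shell), meaning we should augment our domain to
$$\underbrace{\mathbb{R}^3 \times (S^2 \times \mathbb{R}_+)}_{b > 0} \;\cup\; \underbrace{(\mathbb{R}^3 \times \{0\})}_{b = 0} \;\cong\; \mathbb{R}^3 \times \mathbb{R}^3. \tag{1}$$
By (b), a dMRI $S : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}$ must satisfy
$$S(\mathbf{p}, \mathbf{q}) = S(\mathbf{p}, -\mathbf{q}) \quad \text{for all } \mathbf{p}, \mathbf{q} \in \mathbb{R}^3. \tag{2}$$
This constraint is equivalent to saying that $S$ is a function
$$\mathbb{R}^3 \times (\mathbb{RP}^2 \times \mathbb{R}_+) \;\cup\; (\mathbb{R}^3 \times \{0\}) \to \mathbb{R}, \tag{3}$$
where $\mathbb{RP}^2$ is the real projective plane obtained by identifying antipodes on $S^2$.
Nevertheless, for notational convenience, we will think of a dMRI as a function
$$\underbrace{\mathbb{R}^3}_{p\text{-space}} \times \underbrace{\mathbb{R}^3}_{q\text{-space}} \to \mathbb{R} \tag{4}$$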
satisfying Eqn. 2, where we deliberately split the domain into p-space and
q-space. The reasoning for this is twofold. From an acquisition perspective, p-space
and q-space are sampled differently, with p-space being sampled on a fixed,
uniform Cartesian grid and q-space being sampled at $\mathbf{q} = \mathbf{0}$ and on approximately
uniform spherical grids corresponding to a small number of selected b-values.
This is called a multi-shell acquisition, and a single-shell acquisition is the
special case where the signal is only sampled at one nonzero b-value. From a
geometric perspective, Eqns. 2 and 4 are still not enough to fully characterize the
diffusion-weighted signal, because $\mathbb{R}^3 \times \mathbb{R}^3 \cong \mathbb{R}^6$ only describes the domain as
a set. In fact, p-space and q-space are physically coupled in a manner that can
only be captured by an action (see Def. 1) of $SE(3) = \mathbb{R}^3 \rtimes SO(3)$, the group of
3D roto-translations. This action is given by
$$(t, r) \cdot (\mathbf{p}, \mathbf{q}) = (r\mathbf{p} + t,\; r\mathbf{q}) \tag{5}$$
for all $(t, r) \in SE(3)$ and $(\mathbf{p}, \mathbf{q}) \in \mathbb{R}^3 \times \mathbb{R}^3$. This entails that any dMRI $S$ must
appropriately transform under a roto-translation of the domain, i.e.
$$[(t, r) \cdot S](\mathbf{p}, \mathbf{q}) = S(r^{-1}(\mathbf{p} - t),\; r^{-1}\mathbf{q}). \tag{6}$$
To prevent ourselves from viewing the domain $\mathbb{R}^3 \times \mathbb{R}^3$ as just a set in isolation,
we will follow [29] in using the notation $\mathbb{R}^3 \rtimes \mathbb{R}^3$ as a reminder that p-space and
q-space are coupled in a manner less trivial than a direct product.
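The action of Eqn. 5 and the induced transformation of a signal in Eqn. 6 can be sketched in a few lines of Python. This is only an illustrative check under stated assumptions: `toy_signal` is a made-up antipodally symmetric function, and the helper names are hypothetical rather than taken from the authors' code.

```python
import numpy as np

def rotation(axis, angle):
    """Rodrigues' formula for the rotation about `axis` by `angle` (radians)."""
    k = np.asarray(axis, float) / np.linalg.norm(axis)
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * K @ K

def act(t, r, p, q):
    """Eqn. 5: (t, r).(p, q) = (r p + t, r q)."""
    return r @ p + t, r @ q

def compose(t1, r1, t2, r2):
    """Group product in SE(3): (t1, r1)(t2, r2) = (t1 + r1 t2, r1 r2)."""
    return t1 + r1 @ t2, r1 @ r2

def toy_signal(p, q):
    """Hypothetical signal with antipodal symmetry, S(p, q) = S(p, -q)."""
    d = np.array([1.0, 0.0, 0.0])      # a single assumed fiber direction
    return np.exp(-np.dot(p, p)) * np.exp(-np.dot(q, d) ** 2)

def transformed_signal(t, r, S):
    """Eqn. 6: [(t, r).S](p, q) = S(r^{-1}(p - t), r^{-1} q), with r^{-1} = r^T."""
    return lambda p, q: S(r.T @ (p - t), r.T @ q)
```

Evaluating `transformed_signal(t, r, toy_signal)` at `act(t, r, p, q)` returns `toy_signal(p, q)`, which is the sense in which the signal moves rigidly with its domain, with q-space rotating along with p-space.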
2.3 The diff-pq Registration Problem
We demonstrated in Section 2.2 that the natural domain for a raw
diffusion-weighted signal is $\mathbb{R}^3 \rtimes \mathbb{R}^3$, where the first factor denotes p-space and the
second factor denotes q-space. In the case of a derived representation, the q-space
is usually offloaded to the codomain, forcing us to formulate the deformation
of one derived representation into another (of the same type) as a diff-p
registration problem. For raw signals, however, much akin to the anat-p registration
problem, we are dealing with scalar-valued functions whose alignment is simply a matter of
reparameterizing the domain via an appropriate diffeomorphism $\Phi$. Hence, this
naturally gives way to the diff-pq registration problem, which we state here for
completeness: Given raw diffusion-weighted signals
$S_m, S_f : \mathbb{R}^3 \rtimes \mathbb{R}^3 \to \mathbb{R}$, estimate a diffeomorphism
$\Phi : \mathbb{R}^3 \rtimes \mathbb{R}^3 \to \mathbb{R}^3 \rtimes \mathbb{R}^3$ that minimizes
$\mathcal{L}(S_f, S_m \circ \Phi)$ for some dissimilarity $\mathcal{L}$.
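Notice that the notation $\mathbb{R}^3 \rtimes \mathbb{R}^3$ carries weight in this setting, because
a diffeomorphism $\mathbb{R}^3 \rtimes \mathbb{R}^3 \to \mathbb{R}^3 \rtimes \mathbb{R}^3$ is not the same as a diffeomorphism
$\mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}^3 \times \mathbb{R}^3$. In particular, a diffeomorphism
$\Phi : \mathbb{R}^3 \rtimes \mathbb{R}^3 \to \mathbb{R}^3 \rtimes \mathbb{R}^3$
must commute with the SE(3) group action in Eqn. 5, a condition that is best
visualized with the following commutative diagram:
$$\begin{array}{ccc}
\mathbb{R}^3 \rtimes \mathbb{R}^3 & \xrightarrow{\;\Phi\;} & \mathbb{R}^3 \rtimes \mathbb{R}^3 \\
\downarrow {\scriptstyle (t,r)} & & \downarrow {\scriptstyle (t,r)} \\
\mathbb{R}^3 \rtimes \mathbb{R}^3 & \xrightarrow{\;\Phi\;} & \mathbb{R}^3 \rtimes \mathbb{R}^3
\end{array} \tag{7}$$
The commutative diagram in Eqn. 7 is saying that both paths from the top-left
corner to the bottom-right corner are equivalent, i.e.
$\Phi((t, r) \cdot (\mathbf{p}, \mathbf{q})) = (t, r) \cdot \Phi(\mathbf{p}, \mathbf{q})$. Such a $\Phi$ is called an equivariant
diffeomorphism. Therefore, we require some kind of sufficient condition that will
guarantee that we can construct equivariant diffeomorphisms, which will ensure
that we do not violate the physical coupling of p- and q-space. Fortunately, we
can accomplish this by invoking the following theorem [46]: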
Theorem 1. [46] Let $M$ be a Riemannian manifold with a Lie group $G$ acting
on it. Let $X$ be an equivariant vector field on $M$, i.e. $X_{g \cdot p} = g \cdot X_p$ for all $p \in M$
and $g \in G$. Then, the flow generated by $X$ is an equivariant diffeomorphism.
Hence, it is sufficient for us to generate an equivariant vector (velocity) field,
which we may then integrate to obtain an equivariant diffeomorphism.
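A toy example may help make Theorem 1 concrete. The sketch below assumes the hypothetical vector field $X(\mathbf{p}, \mathbf{q}) = (\mathbf{q}, \mathbf{0})$, which is equivariant under the action $(t, r) \cdot (\mathbf{p}, \mathbf{q}) = (r\mathbf{p} + t, r\mathbf{q})$, and integrates it with forward Euler (exact for this particular field, since $\mathbf{q}$ stays constant). It is an illustration only, not the velocity field produced by the network.

```python
import numpy as np

def rot_z(theta):
    """Rotation about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def act(t, r, p, q):
    """SE(3) action on position-and-orientation space: (r p + t, r q)."""
    return r @ p + t, r @ q

def velocity(p, q):
    """Hypothetical equivariant field X(p, q) = (q, 0): move p along q, keep q."""
    return q, np.zeros(3)

def flow(p, q, n_steps=1000):
    """Integrate d(p, q)/dt = X(p, q) from t = 0 to t = 1 with forward Euler."""
    h = 1.0 / n_steps
    for _ in range(n_steps):
        dp, dq = velocity(p, q)
        p, q = p + h * dp, q + h * dq
    return p, q

t, r = np.array([1.0, -2.0, 0.5]), rot_z(0.9)
p, q = np.array([0.2, 0.1, -0.4]), np.array([0.6, -0.3, 0.7])

# Both paths of the commutative diagram: act then flow, or flow then act.
act_then_flow = flow(*act(t, r, p, q))
flow_then_act = act(t, r, *flow(p, q))
```

Here both paths give `(r @ (p + q) + t, r @ q)` up to floating-point error, which is the equivariance property the theorem guarantees for the flow.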
Before proceeding, we remind the reader that, in practice, q-space is sampled
on a small number of concentric spherical shells. Therefore, although the
diffusion-weighted signal does exist on all of $\mathbb{R}^3 \rtimes \mathbb{R}^3$, it is more prudent to
refine our treatment to the constituent shells and piece them back together when
needed. When we restrict ourselves to a single shell, we refer to the new domain
as the single-shell domain, which is $\mathbb{R}^3 \times S^2$ as a set (cf. Eqn. 1). The SE(3) group action (Eqn. 5) is
exactly the same as before, just restricted to this subspace.
2.4 Steerable Convolutions
Using convolutions to generate an equivariant velocity field on the single-shell
domain imposes two conditions on the convolutions themselves: (1) the convolutions
must be SE(3)-equivariant and (2) the convolutions must be able to output vector
fields on that domain.
Although SE(3)-equivariant layers were developed in [26] to handle raw dMRI
signals, their method relies on the regular group representation which is incapable
of producing vector fields. Instead, we need to extend the steerable convolutions
introduced in [13,8], otherwise known as gauge equivariant convolutions, to the
single-shell domain. We now give a brief primer on steerability; see [48] for more detail.
Let $M$ be a 5-dimensional Riemannian manifold with structure group SO(5),
the group of 5D rotations. In the steerable setting, feature maps become tensor
fields on $M$ (e.g. vector fields are order-one tensor fields). A feature map is
coordinatized w.r.t. a gauge, a local choice of coordinate frame. A steerable convolution
maps an input feature map $f_{in}$ of type $\rho_{in}$ to an output feature map $f_{out}$ of
type $\rho_{out}$, where $\rho_{in}$ and $\rho_{out}$ are group representations that encode the
transformation behavior of the tensor components under a change of gauge [48]. Let
$f_{in}$ be a feature map of type $\rho_{in}$ and
$K : \mathbb{R}^5 \to \mathbb{R}^{d_{out} \times d_{in}}$ a matrix-valued filter
where $d_{in}$ and $d_{out}$ are the dimensions of the input and output tensors,
respectively. Letting $q_v := \exp_p w_p v$ (exp denotes the Riemannian exponential map
and $w_p : \mathbb{R}^5 \to T_pM$ a gauge), the convolved feature map $f_{out} = K \star f_{in}$ is given
pointwise w.r.t. $w_p$ by
$$f_{out}(p) := \int_{\mathbb{R}^5} K(v)\, \rho_{in}(t_{p \leftarrow q_v})\, f_{in}(q_v)\, dv, \tag{8}$$
where $t_{p \leftarrow q_v}$ denotes the SO(5)-valued gauge transformation taking the frame
on $q_v$ (after parallel transport to $p$) to the frame on $p$. Eqn. 8 is equivariant
to a change of gauge at $p$ if and only if $K$ is SO(5)-steerable, i.e. $K$ satisfies
$$K(t^{-1} v) = \rho_{out}(t^{-1})\, K(v)\, \rho_{in}(t) \tag{9}$$
for all $t \in SO(5)$ and $v \in \mathbb{R}^5$ [13]. A critical result of [48] shows that, in the case
where $M$ is the single-shell domain of Section 2.3, steerable convolutions with
SO(5)-steerable filters are equivariant to 3D rototranslations.
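To make the steerability constraint of Eqn. 9 tangible, the following sketch checks it numerically for one simple filter. The filter $K(v) = e^{-\|v\|^2} v$, which maps scalar inputs (trivial $\rho_{in}$) to vector outputs (standard $\rho_{out}$), is a hypothetical choice made for this illustration and is not the filter parameterization used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_rotation(n, rng):
    """Draw an element of SO(n) from the QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    q = q * np.sign(np.diag(r))        # fix the sign ambiguity of each column
    if np.linalg.det(q) < 0:           # flip one axis to land in SO(n)
        q[:, 0] = -q[:, 0]
    return q

def kernel(v):
    """Hypothetical filter K: R^5 -> R^{5x1}, K(v) = exp(-|v|^2) v."""
    return (np.exp(-np.dot(v, v)) * v).reshape(5, 1)

def steerability_residual(K, t, v):
    """Largest entry of K(t^{-1} v) - rho_out(t^{-1}) K(v) rho_in(t) (Eqn. 9)."""
    rho_in = np.eye(1)                 # trivial representation: scalars
    rho_out_inv = t.T                  # standard representation of t^{-1}
    return np.abs(K(t.T @ v) - rho_out_inv @ K(v) @ rho_in).max()

t = random_rotation(5, rng)
v = rng.standard_normal(5)
residual = steerability_residual(kernel, t, v)   # close to machine precision
```

A filter that ignores the direction of `v`, for instance one that always returns the first basis vector, would leave a residual of order one, which is why such a filter cannot be used to output vector fields equivariantly.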
2.5 An MMD Loss via Characteristic Functions
We argue that the ultimate goal of raw dMRI registration is not merely to align
the raw signals $S_f$ and $S_m$, but rather to match their corresponding EAPs. As we
shall see in Section 4, this claim is evidenced by the fact that registration quality
in the dMRI setting is evaluated using white matter fiber bundle segmentations,
which are extracted using fODF peaks (an approximation of EAP peaks). The
relation between the raw signal $S$ and its associated EAP, $P$, is given by
$$P_{\mathbf{p}}(\mathbf{r}) = \int_{\mathbb{R}^3} e^{-2\pi i\, \mathbf{q} \cdot \mathbf{r}}\, E(\mathbf{p}, \mathbf{q})\, d\mathbf{q}, \tag{10}$$
where $E(\mathbf{p}, \mathbf{q}) = S(\mathbf{p}, \mathbf{q}) / S(\mathbf{p}, \mathbf{0})$ is the signal attenuation. Therefore, if every
diffusion-weighted volume is normalized by the $b = 0$ volume, the resulting $E$ is
directly related to the EAP via a Fourier transform. Said differently, for every
position $\mathbf{p}$, $E(\mathbf{p}, \cdot)$ is the characteristic function of the probability density $P_{\mathbf{p}}$.
We now introduce a result from the theory of reproducing kernel Hilbert spaces
(RKHSs) that permits us to indirectly minimize the maximum mean discrepancy
(MMD) between EAPs via a modified $L^2$ loss applied to the signal attenuations.
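Suppose $P$ and $Q$ are two distributions on $\mathbb{R}^3$ with corresponding
characteristic functions (Fourier transforms) $\varphi_P$ and $\varphi_Q$ respectively, and let $k$ be a kernel
function on $\mathbb{R}^3$, associated with the RKHS $\mathcal{H}_k$. The MMD between $P$ and $Q$ is
defined as
$$\mathrm{MMD}(P, Q) = \sup_{f \in \mathcal{H}_k,\ \|f\| \leq 1} \left( \int_{\mathbb{R}^3} f\, dP - \int_{\mathbb{R}^3} f\, dQ \right). \tag{11}$$
Bochner’s theorem [36, Thm. 3] states that there exists a finite Borel measure
$\Lambda$ on $\mathbb{R}^3$ such that
$$k(\mathbf{r}, \mathbf{r}') = \int_{\mathbb{R}^3} e^{-i (\mathbf{r} - \mathbf{r}') \cdot \mathbf{q}}\, d\Lambda(\mathbf{q}). \tag{12}$$
It was shown in [36] that
$$\mathrm{MMD}(P, Q) = \|\varphi_P - \varphi_Q\|_{L^2(\mathbb{R}^3)}, \tag{13}$$
i.e. the MMD between $P$ and $Q$ equals the $L^2$ distance between their
characteristic functions with respect to the measure $\Lambda$. In this work, we choose $\Lambda$ to be the
multivariate Gaussian distribution (measure) $\mathcal{N}(0, \sigma^2 I_3)$, whose corresponding
kernel $k$ is exactly the Gaussian kernel
$k(\mathbf{r}, \mathbf{r}') = \exp(-\tfrac{\sigma^2}{2} \|\mathbf{r} - \mathbf{r}'\|^2)$. In this case,
the squared $L^2$ distance between $\varphi_P$ and $\varphi_Q$ equals
$$\mathrm{MMD}^2(P, Q) = \mathbb{E}\, |\varphi_P(X) - \varphi_Q(X)|^2, \quad X \sim \mathcal{N}(0, \sigma^2 I_3). \tag{14}$$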
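The identity between the kernel form and the Fourier form of the (squared) MMD can be sanity-checked on a case with a closed form. The sketch below assumes two point masses at hypothetical locations `a` and `b`, for which the kernel side is $k(a,a) + k(b,b) - 2k(a,b)$, and compares it with a Monte Carlo estimate of the expectation over $X \sim \mathcal{N}(0, \sigma^2 I_3)$. It is an illustration under these assumptions, not a computation from the paper.

```python
import numpy as np

def gaussian_kernel(r1, r2, sigma):
    """Gaussian kernel k(r, r') = exp(-(sigma^2 / 2) |r - r'|^2)."""
    return np.exp(-0.5 * sigma**2 * np.sum((r1 - r2) ** 2))

def mmd2_kernel(a, b, sigma):
    """Squared MMD between point masses at a and b, computed from the kernel."""
    return (gaussian_kernel(a, a, sigma) + gaussian_kernel(b, b, sigma)
            - 2.0 * gaussian_kernel(a, b, sigma))

def mmd2_fourier(a, b, sigma, n_samples=200_000, seed=0):
    """Monte Carlo estimate of E|phi_P(X) - phi_Q(X)|^2, X ~ N(0, sigma^2 I_3)."""
    rng = np.random.default_rng(seed)
    x = sigma * rng.standard_normal((n_samples, 3))
    phi_p = np.exp(1j * x @ a)     # characteristic function of the point mass at a
    phi_q = np.exp(1j * x @ b)     # characteristic function of the point mass at b
    return np.mean(np.abs(phi_p - phi_q) ** 2)

a = np.array([0.2, 0.0, 0.1])
b = np.array([-0.3, 0.5, 0.0])
kernel_side = mmd2_kernel(a, b, sigma=1.5)
fourier_side = mmd2_fourier(a, b, sigma=1.5)
```

The two values agree up to Monte Carlo error, and the sign convention of the complex exponential does not matter here because only the modulus of the difference enters.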
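Letting $\mathcal{E}(\mathbf{p}, \mathbf{q}) := |E_f(\mathbf{p}, \mathbf{q}) - [E_m \circ \Phi](\mathbf{p}, \mathbf{q})|^2$, this translates to the following
loss in our setting:
$$\mathcal{L}(E_f, E_m \circ \Phi) = C \sum_{\mathbf{p}, \mathbf{q}} \mathcal{E}(\mathbf{p}, \mathbf{q}) \cdot e^{-\|\mathbf{q}\|^2 / 2\sigma^2} \cdot \|\mathbf{q}\|^2 + \lambda \sum_{\mathbf{p}, \mathbf{q}} \|v(\mathbf{p}, \mathbf{q})\|^2. \tag{15}$$
Here, $\lambda$ is the regularization hyperparameter, $v$ is the velocity field generated by
the UNet described in Section 3, and $C = (2\pi\sigma^2)^{-3/2}$ is a normalization constant.
Note that the guarantee provided by Eqn. 13 would not hold if we simply resorted
to a standard MSE loss, since a Lebesgue measure in isolation is not finite.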
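A minimal NumPy sketch of the loss in Eqn. 15 follows. The array layout, the function name `mmd_registration_loss`, and the default values of `sigma` and `lam` are assumptions made for illustration; the paper does not state them, and the authors' implementation may differ.

```python
import numpy as np

def mmd_registration_loss(e_fixed, e_warped, q_vectors, velocity, sigma=1.0, lam=0.01):
    """Sketch of Eqn. 15 (assumed array layout).

    e_fixed, e_warped : (X, Y, Z, Q) signal attenuations E_f and E_m o Phi
    q_vectors         : (Q, 3) q-space sample locations
    velocity          : (X, Y, Z, Q, C) velocity field components
    """
    q_norm2 = np.sum(q_vectors**2, axis=1)                    # ||q||^2 per q-sample
    weights = np.exp(-q_norm2 / (2.0 * sigma**2)) * q_norm2   # Gaussian weight times ||q||^2
    c = (2.0 * np.pi * sigma**2) ** (-1.5)                    # normalization constant C
    residual = np.abs(e_fixed - e_warped) ** 2                # pointwise squared difference
    data_term = c * np.sum(residual * weights)                # weights broadcast over the Q axis
    reg_term = lam * np.sum(velocity**2)                      # sum of ||v(p, q)||^2
    return data_term + reg_term
```

Note that with this weighting a sample at $\mathbf{q} = \mathbf{0}$ contributes nothing to the data term, since its weight contains the factor $\|\mathbf{q}\|^2$.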
3 Implementation
3.1 Constructing SE(3)-Equivariant Convolution Layers
The most challenging obstacle to implementing the steerable convolutions of
Section 2.4 in practice is generating an SO(5)-steerable filter that satisfies Eqn.
9. Historically, this is done in one of two ways: (1) analytically solve for a basis
of the linear subspace of filters satisfying Eqn. 9, or (2) parameterize the con-
volution filter using an equivariant MLP. We opt for the second approach since
it circumvents the need to solve for an explicit basis, and it has shown superior
performance in equivariant tasks [54]. We can invoke the following lemma to
parameterize a filter that satisfies Eqn. 9 using an MLP:
Lemma 1. [54] If a filter $K$ is parameterized by an SO(5)-equivariant MLP
with input representation $\rho_{st}$ and output representation $\rho_\otimes := \rho_{in} \otimes \rho_{out}$, then
the filter satisfies the steerability constraint in Equation 9.
In Lemma 1 above, $\rho_{st}$ denotes the standard representation given by $\rho_{st}(g) = g$
and $\rho_\otimes$ denotes the tensor product representation given by
$\rho_\otimes(g) = \rho_{in}(g) \otimes \rho_{out}(g)$. We can construct an SO(5)-equivariant MLP matching the description
of Lemma 1 by using the open source method of [19]. To summarize briefly, the
authors of [19] efficiently build equivariant MLPs by decomposing the equivari-
ance constraint into a finite set of simpler constraints involving the Lie group’s
generators, which are elements of its Lie algebra. These constraints can be solved
efficiently at initialization time, and are thus a one-time computational cost.
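The idea of reducing an equivariance constraint to constraints on Lie algebra generators can be illustrated in its simplest setting. The sketch below does not use the library of [19]; it is a hand-rolled NumPy example, assuming a single linear layer $W$ whose input and output both carry the standard representation of SO(5), for which the constraint reads $AW - WA = 0$ for every generator $A$.

```python
import numpy as np
from itertools import combinations

def so_generators(n):
    """Basis of the Lie algebra so(n): matrices E_ij - E_ji for i < j."""
    gens = []
    for i, j in combinations(range(n), 2):
        a = np.zeros((n, n))
        a[i, j], a[j, i] = 1.0, -1.0
        gens.append(a)
    return gens

def equivariant_linear_maps(n, tol=1e-10):
    """Solve A W - W A = 0 for every generator A of so(n).

    Returns an orthonormal basis (as n x n matrices) of the solution space."""
    eye = np.eye(n)
    # Row-major vectorization: vec(A W) = (A kron I) vec(W), vec(W A) = (I kron A^T) vec(W).
    constraints = np.vstack(
        [np.kron(a, eye) - np.kron(eye, a.T) for a in so_generators(n)])
    _, s, vt = np.linalg.svd(constraints)
    null_mask = s < tol                      # singular values that are numerically zero
    return [row.reshape(n, n) for row in vt[null_mask]]

basis = equivariant_linear_maps(5)           # one-time solve, as at initialization
```

For SO(5) the solution space is one-dimensional and spanned by the identity matrix, so the only equivariant linear maps in this setting are scalar multiples of the identity. Richer input and output representations enlarge the solution space, but the solve remains a one-time cost.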
3.2 Network Input
To prevent wasting model capacity on learning large, parameterizable motion
between images, we follow the preparatory step of affinely aligning a moving
image to the fixed image, as suggested in [6]. This is important for two reasons.
Firstly, nonlinear registration algorithms generally require a sensible initializa-
tion to perform well. Secondly, since VM-inspired registration makes the implicit
assumption that the moving and fixed images are roughly aligned in p-space be-
fore concatenation, we need to ensure that this assumption is also met in q-space
so that concatenation remains meaningful. Therefore, we take advantage of the
affine pre-alignment step to also resample the moving image’s q-space at the
fixed image’s q-vectors. The framework of [37,18] permits us to perform both
tasks in one step using angular interpolation (see Eqn. 19).
As motivated in Section 2.5, we then normalize the diffusion-weighted
volumes by the mean $b = 0$ volume, and thus we now call the moving and fixed
images $E_m$ and $E_f$, respectively. Finally, to address noise in the raw signal
attenuations, we apply a low-pass filter to the training data. For a fixed voxel,
the intensities on a given shell are expanded in terms of a truncated spherical
harmonic basis ($\ell_{\max} = 5$). This yields $\sum_{\ell=0}^{\ell_{\max}} (2\ell + 1)$ coefficients per shell. These
coefficients are then smoothed spatially across voxels using a Gaussian filter. The
signal attenuations are subsequently reconstructed from the smoothed spherical
harmonic coefficients, yielding a denoised representation suitable for training.
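The normalization step and the coefficient count can be sketched as follows. The function names, the `b0_threshold` used to identify $b = 0$ volumes, and the small `eps` guard against division by zero are assumptions for this illustration and do not come from the paper.

```python
import numpy as np

def signal_attenuation(dwi, bvals, b0_threshold=50.0, eps=1e-8):
    """Divide every diffusion-weighted volume by the mean b = 0 volume.

    dwi   : (X, Y, Z, N) array of volumes
    bvals : (N,) array of b-values
    """
    b0_mask = bvals <= b0_threshold                        # volumes treated as b = 0
    mean_b0 = dwi[..., b0_mask].mean(axis=-1, keepdims=True)
    return dwi[..., ~b0_mask] / np.maximum(mean_b0, eps)   # attenuation E = S / S_0

def num_sh_coefficients(l_max):
    """Number of spherical harmonic coefficients for degrees 0..l_max."""
    return sum(2 * l + 1 for l in range(l_max + 1))

n_coeffs = num_sh_coefficients(5)                          # 1 + 3 + 5 + 7 + 9 + 11
```

With a maximum degree of 5, the sum evaluates to 36 coefficients per shell.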
3.3 Network Architecture
In the spirit of DDMReg [51] and MVCReg [6], we continue the successful trend
of using a VM-inspired backbone to perform image registration, while also in-
corporating the crucial property of SE(3)-equivariance. The first module is the
UNet, which is responsible for generating a velocity field $v$ on the single-shell domain.
Our UNet has an encoder depth of 3. Each layer consists of an SE(3)-equivariant
convolution as described in Sections 2.4 and 3.1, followed by swish nonlineari-
ties [34] on scalar features and gated nonlinearities [49] on higher order features,
followed by e3nn-inspired batch normalization [20]. We interleave the encoding
layers with average pooling across p-space.
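Next, inspired by [14], a scaling-and-squaring layer integrates the velocity
field $v$ to apply a diffeomorphism $\Phi$. We relate $v$ to $\Phi$ via the differential equation
$$\frac{d\Phi^{(t)}}{dt} = v(\Phi^{(t)}), \qquad \Phi^{(0)} = \mathrm{id} \tag{16}$$
by stipulating that $\Phi = \Phi^{(1)}$. Hence, $\Phi = \exp(v)$, and scaling-and-squaring is a
numerical technique that approximates this exponential map. This is done by
first scaling $v$ by $2^{-N}$ to create an incrementally small displacement
$\Phi^{(0)}(\mathbf{p}, \mathbf{g}) = \exp(2^{-N} v(\mathbf{p}, \mathbf{g}))$, where the superscript now indexes the
squaring iteration rather than time. Recall that $(\mathbf{p}, \mathbf{g})$ is a point of the
single-shell domain (see Section 2.2). Then, the displacement is
squared iteratively $N$ times:
$$\Phi^{(j+1)}(\mathbf{p}, \mathbf{g}) = \Phi^{(j)}(\Phi^{(j)}(\mathbf{p}, \mathbf{g})), \quad j = 0, 1, \ldots, N - 1. \tag{17}$$
After $N$ squaring steps, $\Phi^{(N)}(\mathbf{p}, \mathbf{g})$ approximates $\Phi(\mathbf{p}, \mathbf{g}) = \exp(v(\mathbf{p}, \mathbf{g}))$. We
set $N = 4$ in our implementation. We represent a tangent vector $v(\mathbf{p}, \mathbf{g})$ as a
6D vector whose first three components are the tangent vector $v_{\mathbb{R}^3}$ and whose
last three components are the tangent vector $v_{S^2}$ (embedded in $\mathbb{R}^3$). Using this
representation, we have that
$$\exp(v(\mathbf{p}, \mathbf{g})) = \left( \mathbf{p} + v_{\mathbb{R}^3},\; \cos(\|v_{S^2}\|)\, \mathbf{g} + \sin(\|v_{S^2}\|)\, \frac{v_{S^2}}{\|v_{S^2}\|} \right). \tag{18}$$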
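The exponential map of Eqn. 18 and the squaring recursion of Eqn. 17 can be sketched pointwise as below. This is a simplified illustration: it treats the velocity field as a Python callable instead of a sampled grid (so no interpolation is needed), and the projection of $v_{S^2}$ onto the tangent plane and the zero-norm guard are added assumptions for numerical robustness.

```python
import numpy as np

def exp_map(p, g, v_r3, v_s2, eps=1e-12):
    """Eqn. 18: exponential map on R^3 x S^2 at the point (p, g)."""
    v_s2 = v_s2 - np.dot(v_s2, g) * g          # keep only the part tangent to S^2 at g
    theta = np.linalg.norm(v_s2)
    if theta < eps:                            # zero angular velocity: g is unchanged
        return p + v_r3, g
    g_new = np.cos(theta) * g + np.sin(theta) * v_s2 / theta
    return p + v_r3, g_new / np.linalg.norm(g_new)

def square(phi):
    """One squaring step of Eqn. 17: compose a map with itself."""
    return lambda p, g: phi(*phi(p, g))

def scaling_and_squaring(velocity, n_steps=4):
    """Approximate exp(v) by scaling v by 2^-N and squaring N times."""
    scale = 2.0 ** (-n_steps)

    def phi(p, g):
        v_r3, v_s2 = velocity(p, g)
        return exp_map(p, g, scale * v_r3, scale * v_s2)

    for _ in range(n_steps):
        phi = square(phi)
    return phi

# Hypothetical velocity field: constant translation plus rotation of g about the z-axis.
omega = np.array([0.0, 0.0, np.pi / 2])
velocity = lambda p, g: (np.array([1.0, 2.0, 3.0]), np.cross(omega, g))
warp = scaling_and_squaring(velocity, n_steps=4)
p1, g1 = warp(np.zeros(3), np.array([1.0, 0.0, 0.0]))
```

For this field the direction `g` starts on the equator, so the warp translates `p` by the constant vector and rotates `g` by a quarter turn about the z-axis; for general fields the result is only an approximation of the flow.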
Fig. 2. Overview of the proposed registration pipeline, parameterized by an
SE(3)-equivariant UNet $g_\theta$. Blocks shown in the schematic: moving image, fixed image,
velocity field, integration layer, spatial transform, warped image, and loss.
Finally, a spatial transformer module tailored to the single-shell domain applies the
computed displacement $\Phi$ to the moving image $E_m$, yielding the warped image.
This is done by using trilinear interpolation in p-space and angular interpolation
in q-space [37]. Given the signal attenuation $E$ sampled at orientations $\mathbf{g}_j$, we
interpolate the attenuation value at $(\mathbf{p}, \mathbf{g})$ as
$$E(\mathbf{p}, \mathbf{g}) = \frac{\sum_j w_j\, E(\mathbf{p}, \mathbf{g}_j)}{\sum_j w_j}, \qquad w_j = e^{-(\arccos(|\mathbf{g} \cdot \mathbf{g}_j|))^2 / 2\sigma^2} \tag{19}$$
where each $w_j$ is a spherical radial basis function (RBF) centered at $\mathbf{g}_j$ with
Gaussian kernel and smoothness parameter $\sigma$. The absolute value comes from
Eqn. 2. We set $\sigma = 0.1$ in our implementation. See Figure 2 for a schematic of
our network architecture.
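The angular interpolation of Eqn. 19 at a single voxel can be written compactly in NumPy. The function name and argument layout below are assumptions for illustration; the clipping before `arccos` is an added guard against rounding slightly above 1.

```python
import numpy as np

def angular_interpolate(e_samples, g_samples, g, sigma=0.1):
    """Eqn. 19 at one voxel.

    e_samples : (J,) attenuations E(p, g_j)
    g_samples : (J, 3) unit sampling directions g_j
    g         : (3,) unit query direction
    """
    cos_angle = np.clip(np.abs(g_samples @ g), 0.0, 1.0)   # |g . g_j| encodes antipodal symmetry
    angle = np.arccos(cos_angle)                            # geodesic angle on the projective sphere
    w = np.exp(-angle**2 / (2.0 * sigma**2))                # Gaussian spherical RBF weights
    return np.sum(w * e_samples) / np.sum(w)

g_samples = np.eye(3)                                       # three orthogonal directions
e_samples = np.array([0.2, 0.5, 0.9])
g_query = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
value = angular_interpolate(e_samples, g_samples, g_query)
```

Because of the absolute value, querying at `-g_query` returns the same value as at `g_query`, and with $\sigma = 0.1$ the result is dominated by the nearest sampled directions.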
4 Experiments
4.1 dMRI Registration Applied to HCP Data
Dataset We conducted training and evaluation using minimally preprocessed
dMRI data from the Human Connectome Project (HCP) Young Adult dataset.
This dataset contains dMRI brain scans from 1,200 individuals aged 22–35. De-
tailed information about the acquisition parameters, subject selection criteria,
and preprocessing steps can be found in the original HCP study [42]. For our
analysis, we selected the same 400 subjects as in MVCReg to facilitate a more
direct performance comparison. Using MRtrix3 [40], 51 (25 validation subjects,
25 testing subjects, and 1 fixed subject) of these subjects underwent three-tissue
response function estimation [17] and multi-shell multi-tissue constrained spher-
ical deconvolution [23] to yield fODF maps. These were subsequently intensity
normalized and bias field corrected [31] before finally extracting the fODF peaks,
which are needed for model evaluation.
Table 1. Test time performance of various registration methods, averaged over all 25
test subjects and all 72 available white matter tracts.
Method        Modality    Dice
Ours          raw dMRI    0.7468
mrregister    fODF        0.7601
VoxelMorph    FA          0.7126
DDMReg        FA, TOM     0.7417
MVCReg        fODF        0.7317
MVVSReg       fODF        0.7493
Evaluation We evaluate registration quality by comparing the overlap of known
white matter tracts in the warped and fixed image. We measure this overlap using
the Dice score. To generate segmentations of white matter tracts, we use the well-
validated TractSeg segmentation model [47], which is capable of segmenting 72
distinct white matter tracts. The 51 peak maps produced above serve as input
to the TractSeg software. For each of the 25 validation subjects and the fixed
subject, we generated tract orientation maps (TOMs) for all 72 bundles. In
the diff-p setting, validation can be easily performed by applying the estimated
deformation $\Phi : \mathbb{R}^3 \to \mathbb{R}^3$ to bundle segmentations of the moving image $S_m$ and
measuring the resulting overlap with bundle segmentations of the fixed image
$S_f$. However, in our diff-pq setting, it does not immediately make sense to warp
a bundle segmentation (a 3D volume) with a deformation $\Phi$ of position-and-orientation space. Therefore, we
are forced to predict our model’s test time behavior by monitoring a different
metric, namely the alignment of warped TOMs with the fixed image’s TOMs. At
test time, the 25 moving test subjects are warped by the model and subsequently
fed to TractSeg to generate bundle segmentation masks for all 72 bundles, which
are then compared with the fixed subject’s bundle segmentation masks.
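For reference, the Dice score between two binary bundle masks can be computed as in the sketch below. The function name and the convention of returning 1.0 for two empty masks are assumptions of this illustration.

```python
import numpy as np

def dice_score(mask_a, mask_b):
    """Dice overlap 2|A n B| / (|A| + |B|) between two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0          # assumed convention: two empty masks overlap perfectly
    return 2.0 * np.logical_and(a, b).sum() / total

warped_mask = np.array([1, 1, 0, 0])
fixed_mask = np.array([1, 0, 1, 0])
score = dice_score(warped_mask, fixed_mask)   # one shared voxel out of two in each mask
```

Averaging this score over all tracts and test subjects gives a summary number of the kind reported in Table 1.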
We selected the same fixed image as in MVCReg [6], and all other subjects
are registered to this fixed image. Hence, there are 399 moving/fixed image pairs,
349 of which are used for training, 25 of which are used for validation, and 25 of
which are used for testing. Since this task is memory-intensive, we are forced to
downsample the spatial dimensions from 145 ×174 ×145 to 72 ×88 ×72.
We trained our model for 500 epochs and we used stochastic gradient descent
(SGD) with Nesterov momentum as an optimizer, which we found worked better
than Adam on our small batch size of 1. Our initial learning rate was 0.001 and
we used a learning rate scheduler that halves the learning rate every 100 epochs.
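The step schedule described above can be written as a small framework-independent function. This is a sketch of the stated rule only; the function name is hypothetical and the authors' training code is not shown in the paper.

```python
def learning_rate(epoch, initial_lr=1e-3, step=100, factor=0.5):
    """Learning rate that starts at `initial_lr` and is halved every `step` epochs."""
    return initial_lr * factor ** (epoch // step)

# Learning rates at the start of each 100-epoch block of the 500-epoch run.
schedule = [learning_rate(e) for e in range(0, 500, 100)]
```

Under this rule the learning rate in the final block of epochs is one sixteenth of its initial value.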
We compare our method against one classical approach (mrregister) and
four deep learning approaches (VoxelMorph, DDMReg, MVCReg, MVVSReg).
MRtrix3’s mrregister [33,32] is a classical method based on symmetric normal-
ization (SyN). We also include a VoxelMorph baseline trained on FA maps. Both
DDMReg and MVCReg are methods specifically designed for dMRI registration,
as discussed in the introduction. MVVSReg [6] is an extension of MVCReg that
uses second order, manifold-valued convolutions.
Results Quantitative and qualitative results on HCP dMRI registration are
presented in Table 1 and Figure 3, respectively. Figure 3 uses checkerboard masks
in parts (d) and (e) to overlay alternating patches from two images, enabling
visual assessment of registration quality by highlighting alignment differences
between the moving and fixed images (d) or the warped and fixed images (e).
Our method (Dice 0.7468) outperforms all data-driven, SOTA methods in Dice score except
MVVSReg (0.7493). We attribute this to the fact that MVVSReg utilizes higher order
convolutions that capture richer features, though at the expense of higher
time and memory complexity [6]. Furthermore, we emphasize
that none of the other methods are capable of registering the raw
diffusion-weighted data. Hence, they require additional offline overhead to estimate input
features. Overall, our method strikes a balance between competitive registration
quality and generality, offering a solution to practitioners who do not want to
commit a priori to a specific derived representation at this stage of their pipeline.

Fig. 3. Visualization of registration results. Top row: $b = 0$ axial slices with overlaid
fODF maps. Bottom row: $b = 0$ coronal slices with overlaid fODF maps. (a) Moving
source image, (b) warped source image, (c) fixed target image, (d) checkerboard view
of moving vs. fixed image, (e) checkerboard view of warped vs. fixed image. The
highlighted ROI shows a tract discontinuity in (d) that is resolved after warping in (e).
Note that SE(3)-equivariance preserves white matter fiber tracts during warping.
5 Conclusion
In this paper, we have introduced a novel framework for registering raw dMRI
signals that more directly leverages orientational information. We accomplish
this by constructing an SE(3)-equivariant UNet to generate velocity fields on
the raw signal domain, and by applying key theoretical results to ensure that
the physical coupling of p- and q-space is preserved. To our knowledge, we are
the first to present a data-driven technique that registers the raw dMRI
signals directly, rather than first computing a derived representation. Our HCP
dMRI registration experiment demonstrates that the proposed method achieves
competitive performance against state-of-the-art deep learning registration
approaches. In future work, we aim to apply our proposed registration method to
dMRI scans from patient groups with neurodegenerative disorders.
References
1. Alexander, D.C., Pierpaoli, C., Basser, P.J., Gee, J.C.: Spatial transformations of
diffusion tensor magnetic resonance images. IEEE transactions on medical imaging
20(11), 1131–1139 (2001)
2. Avants, B.B., Epstein, C.L., Grossman, M., Gee, J.C.: Symmetric diffeomorphic
image registration with cross-correlation: evaluating automated labeling of elderly
and neurodegenerative brain. Medical image analysis 12(1), 26–41 (2008)
3. Balakrishnan, G., Zhao, A., Sabuncu, M.R., Guttag, J., Dalca, A.V.: Voxelmorph:
a learning framework for deformable medical image registration. IEEE transactions
on medical imaging 38(8), 1788–1800 (2019)
4. Basser, P.J., Mattiello, J., Lebihan, D.: Estimation of the effective self-diffusion
tensor from the nmr spin echo. Journal of Magnetic Resonance Ser. B 103(3),
247–254 (1994)
5. Bouza, J.J., Yang, C.H., Vaillancourt, D., Vemuri, B.C.: A higher order manifold-
valued convolutional neural network with applications to diffusion mri processing.
In: Information Processing in Medical Imaging: 27th International Conference,
IPMI 2021, Virtual Event, June 28–June 30, 2021, Proceedings 27. pp. 304–317.
Springer (2021)
6. Bouza, J.J., Yang, C.H., Vemuri, B.C.: Geometric deep learning for unsupervised
registration of diffusion magnetic resonance images. In: International Conference
on Information Processing in Medical Imaging. pp. 563–575. Springer (2023)
7. Bronstein, M.M., Bruna, J., LeCun, Y., Szlam, A., Vandergheynst, P.: Geometric
deep learning: going beyond euclidean data. IEEE Signal Processing Magazine
34(4), 18–42 (2017)
8. Cesa, G., Lang, L., Weiler, M.: A program to build E(N)-equivariant steerable
CNNs. In: International Conference on Learning Representations (ICLR) (2022),
https://openreview.net/forum?id=WE4qe9xlnQw
9. Chakraborty, R., Bouza, J., Manton, J.H., Vemuri, B.C.: Manifoldnet: A deep
neural network for manifold-valued data with applications. IEEE Transactions on
Pattern Analysis and Machine Intelligence 44(2), 799–810 (2020)
10. Cheng, G., Salehian, H., Forder, J., Vemuri, B.C.: Tractography from hardi using
an intrinsic unscented kalman filter. IEEE Transactions on Medical Imaging 35(1),
298–305 (2014)
11. Cheng, G., Vemuri, B.C., Carney, P.R., Mareci, T.H.: Non-rigid registration of
high angular resolution diffusion images represented by gaussian mixture fields.
In: Medical Image Computing and Computer-Assisted Intervention–MICCAI 2009:
12th International Conference, London, UK, September 20-24, 2009, Proceedings,
Part I 12. pp. 190–197. Springer (2009)
12. Christensen, G.E., Rabbitt, R.D., Miller, M.I.: Deformable templates using large
deformation kinematics. IEEE transactions on image processing 5(10), 1435–1447
(1996)
13. Cohen, T., Weiler, M., Kicanaoglu, B., Welling, M.: Gauge equivariant convo-
lutional networks and the icosahedral CNN. In: Proceedings of the 36th Inter-
national Conference on Machine Learning. Proceedings of Machine Learning Re-
search, vol. 97, pp. 1321–1330. PMLR (09–15 Jun 2019)
14. Dalca, A.V., Balakrishnan, G., Guttag, J., Sabuncu, M.R.: Unsupervised learning
of probabilistic diffeomorphic registration for images and surfaces. Medical image
analysis 57, 226–236 (2019)
15. Demir, B., Tian, L., Greer, H., Kwitt, R., Vialard, F.X., Estépar, R.S.J., Bouix,
S., Rushmore, R., Ebrahim, E., Niethammer, M.: Multigradicon: A foundation
model for multimodal medical image registration. In: International Workshop on
Biomedical Image Registration. pp. 3–18. Springer (2024)
16. Descoteaux, M., Deriche, R., Le Bihan, D., Mangin, J.F., Poupon, C.: Multiple q-
shell diffusion propagator imaging. Medical Image Analysis 15(4), 603–621 (2010)
17. Dhollander, T., Raffelt, D., Connelly, A.: Unsupervised 3-tissue response function
estimation from single-shell or multi-shell diffusion mr data without a co-registered
t1 image. In: ISMRM workshop on breaking the barriers of diffusion MRI. vol. 5.
Lisbon, Portugal (2016)
18. Duarte-Carvajalino, J.M., Sapiro, G., Harel, N., Lenglet, C.: A framework for linear
and non-linear registration of diffusion-weighted mris using angular interpolation.
Frontiers in neuroscience 7, 41 (2013)
19. Finzi, M., Welling, M., Wilson, A.G.: A practical method for constructing equivari-
ant multilayer perceptrons for arbitrary matrix groups. In: International conference
on machine learning. pp. 3318–3328. PMLR (2021)
20. Geiger, M., Smidt, T.: e3nn: Euclidean neural networks. arXiv preprint
arXiv:2207.09453 (2022)
21. Hagmann, P., Jonasson, L., Maeder, P., Thiran, J.P., Wedeen, V.J., Meuli, R.: Un-
derstanding diffusion mr imaging techniques: from scalar diffusion-weighted imag-
ing to diffusion tensor imaging and beyond. Radiographics 26(suppl_1), S205–S223
(2006)
22. Jaderberg, M., Simonyan, K., Zisserman, A., et al.: Spatial transformer networks.
Advances in neural information processing systems 28 (2015)
23. Jeurissen, B., Tournier, J.D., Dhollander, T., Connelly, A., Sijbers, J.: Multi-tissue
constrained spherical deconvolution for improved analysis of multi-shell diffusion
mri data. NeuroImage 103, 411–426 (2014)
24. Jian, B., Vemuri, B.C.: A unified computational framework for deconvolution to
reconstruct multiple fibers from diffusion weighted mri. IEEE Trans. on Med. Imag-
ing 26(11), 1464–1471 (2007)
25. Jian, B., Vemuri, B.C., Özarslan, E., Carney, P.R., Mareci, T.H.: A novel tensor
distribution model for the diffusion-weighted mr signal. Neuroimage 37(1), 164–176
(2007)
26. Liu, R., Lauze, F., et al.: SE(3) Group Conv. NN and a Study on Group
Convs. and Equivariance for DWI Segmentation (2023).
https://doi.org/10.21203/rs.3.rs-2531880/v1,
https://europepmc.org/article/PPR/PPR624395, preprint
27. Lorenzi, M., Pennec, X.: Geodesics, parallel transport & one-parameter subgroups
for diffeomorphic image registration. International Journal of Computer Vision
105(2), 111–127 (2013)
28. McGraw, T., Vemuri, B.C., Chen, Y., Rao, M., Mareci, T.: Dt-mri denoising and
neuronal fiber tracking. Medical Image Analysis 8(2), 95–111 (2004)
29. Müller, P., Golkov, V., Tomassini, V., Cremers, D.: Rotation-equivariant deep
learning for diffusion mri. arXiv preprint arXiv:2102.06942 (2021)
30. Özarslan, E., Koay, C.G., Shepherd, T.M., Komlosh, M.E., İrfanoğlu, M.O., Pier-
paoli, C., Basser, P.J.: Mean apparent propagator (map) mri: a novel diffusion
imaging method for mapping tissue microstructure. NeuroImage 78, 16–32 (2013)
31. Raffelt, D., Dhollander, T., Tournier, J.D., Tabbara, R., Smith, R.E., Pierre, E.,
Connelly, A.: Bias field correction and intensity normalisation for quantitative anal-
ysis of apparent fibre density. In: Proc. Intl. Soc. Mag. Reson. Med. vol. 25, p. 3541
(2017)
32. Raffelt, D., Tournier, J.D., Crozier, S., Connelly, A., Salvado, O.: Reorientation
of fiber orientation distributions using apodized point spread functions. Magnetic
resonance in medicine 67(3), 844–855 (2012)
33. Raffelt, D., Tournier, J.D., Fripp, J., Crozier, S., Connelly, A., Salvado, O.: Sym-
metric diffeomorphic registration of fibre orientation distributions. Neuroimage
56(3), 1171–1180 (2011)
34. Ramachandran, P., Zoph, B., Le, Q.V.: Searching for activation functions. arXiv
preprint arXiv:1710.05941 (2017)
35. Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomed-
ical image segmentation. In: Medical image computing and computer-assisted
intervention–MICCAI 2015: 18th international conference, Munich, Germany, Oc-
tober 5-9, 2015, proceedings, part III 18. pp. 234–241. Springer (2015)
36. Sriperumbudur, B.K., Gretton, A., Fukumizu, K., Schölkopf, B., Lanckriet, G.R.:
Hilbert space embeddings and metrics on probability measures. The Journal of
Machine Learning Research 11, 1517–1561 (2010)
37. Tao, X., Miller, J.V.: A method for registering diffusion weighted magnetic res-
onance images. In: International Conference on Medical Image Computing and
Computer-Assisted Intervention. pp. 594–602. Springer (2006)
38. Tournier, D.J., Calamante, F., Connelly, A.: Robust determination of the fibre
orientation distribution in diffusion mri: non-negativity constrained super-resolved
spherical deconvolution. Neuroimage 35(4), 1459–1472 (2007)
39. Tournier, J.D., Calamante, F., Gadian, D.G., Connelly, A.: Direct estimation of the
fiber orientation density function from diffusion-weighted mri data using spherical
deconvolution. Neuroimage 23(3), 1176–1185 (2004)
40. Tournier, J.D., Smith, R., Raffelt, D., Tabbara, R., Dhollander, T., Pietsch, M.,
Christiaens, D., Jeurissen, B., Yeh, C.H., Connelly, A.: Mrtrix3: A fast, flexible
and open software framework for medical image processing and visualisation. Neu-
roimage 202, 116137 (2019)
41. Tschumperle, D., Deriche, R.: Variational frameworks for dt-mri estimation, regu-
larization and visualization. In: Proc. of the IEEE International Conf. on Computer
Vision. pp. 116–121 (2003)
42. Van Essen, D.C., Ugurbil, K., Auerbach, E., Barch, D., Behrens, T.E., Bucholz, R.,
Chang, A., Chen, L., Corbetta, M., Curtiss, S.W., et al.: The human connectome
project: a data acquisition perspective. Neuroimage 62(4), 2222–2231 (2012)
43. Vercauteren, T., Pennec, X., Perchant, A., Ayache, N.: Diffeomorphic demons:
Efficient non-parametric image registration. Neuroimage 45(1), S61–S72 (2009)
44. Wang, F., Vemuri, B.C.: Non-rigid multi-modal image registration using
cross-cumulative residual entropy. International Journal of Computer Vision 74,
201–215 (2007)
45. Wang, Z., Vemuri, B.C., Chen, Y., Mareci, T.H.: A constrained variational principle
for direct estimation and smoothing of the diffusion tensor field from complex dwi.
IEEE Transactions on Medical Imaging 23(8), 930–939 (2004)
46. Wasserman, A.G.: Equivariant differential topology. Topology 8(2), 127–150 (1969)
47. Wasserthal, J., Neher, P., Maier-Hein, K.H.: Tractseg-fast and accurate white mat-
ter tract segmentation. NeuroImage 183, 239–253 (2018)
48. Weiler, M.: Equivariant and Coordinate Independent Convolutional Networks. Phd
thesis, University of Amsterdam (2023)
49. Weiler, M., Geiger, M., Welling, M., Boomsma, W., Cohen, T.S.: 3d steerable cnns:
Learning rotationally equivariant features in volumetric data. Advances in Neural
Information Processing Systems 31 (2018)
50. Yang, X., Kwitt, R., Styner, M., Niethammer, M.: Quicksilver: Fast predictive
image registration–a deep learning approach. NeuroImage 158, 378–396 (2017)
51. Zhang, F., Wells, W.M., O’Donnell, L.J.: Deep diffusion mri registration (ddmreg):
a deep learning method for diffusion mri registration. IEEE transactions on medical
imaging 41(6), 1454–1467 (2021)
52. Zhang, H., Yushkevich, P.A., Alexander, D.C., Gee, J.C.: Deformable registration
of diffusion tensor mr images with explicit orientation optimization. Medical image
analysis 10(5), 764–785 (2006)
53. Zhang, P., Niethammer, M., Shen, D., Yap, P.T.: Large deformation diffeomor-
phic registration of diffusion-weighted imaging data. Medical image analysis 18(8),
1290–1298 (2014)
54. Zhdanov, M., Hoffmann, N., Cesa, G.: Implicit convolutional kernels for steerable
cnns. Advances in Neural Information Processing Systems 36 (2024)
55. Özarslan, E., Shepherd, T.M., Vemuri, B.C., Blackband, S.J., Mareci, T.H.: Reso-
lution of complex tissue microarchitecture using the diffusion orientation transform
(dot). Neuroimage (3), 1086–1103 (2005)