Available via license: CC BY 4.0
arXiv:2501.04794v1 [eess.IV] 8 Jan 2025
A Steerable Deep Network for Model-Free
Diffusion MRI Registration
Gianfranco Cortés and Baba C. Vemuri
Department of CISE, University of Florida, Gainesville, FL, USA
{gcortes, vemuri}@ufl.edu
Abstract. Nonrigid registration is vital to medical image analysis but
remains challenging for diffusion MRI (dMRI) due to its high-dimensional,
orientation-dependent nature. While classical methods are accurate, they
are computationally demanding, and deep neural networks, though effi-
cient, have been underexplored for nonrigid dMRI registration compared
to structural imaging. We present a novel, deep learning framework for
model-free, nonrigid registration of raw diffusion MRI data that does not
require explicit reorientation. Unlike previous methods relying on derived
representations such as diffusion tensors or fiber orientation distribution
functions, in our approach, we formulate the registration as an equiv-
ariant diffeomorphism of position-and-orientation space. Central to our
method is an SE(3)-equivariant UNet that generates velocity fields while
preserving the geometric properties of a raw dMRI’s domain. We intro-
duce a new loss function based on the maximum mean discrepancy in
Fourier space, implicitly matching ensemble average propagators across
images. Experimental results on Human Connectome Project dMRI data
demonstrate competitive performance compared to state-of-the-art ap-
proaches, with the added advantage of bypassing the overhead for esti-
mating derived representations. This work establishes a foundation for
data-driven, geometry-aware dMRI registration directly in the acquisi-
tion space.
Keywords: Diffusion MRI · Steerable CNN · Special Euclidean Group
· Nonrigid Registration · RKHS
1 Introduction
Computing a nonrigid deformation mapping one image, $S_m$, to another, $S_f$,
is an essential task in medical image analysis, most notably for cross-subject
comparisons, population-specific atlas construction, and atlas-based segmentation.
This task, known as image registration, possesses well-established classical
[12,44,2,43,27] and deep learning [3,14,15] solutions that are readily applicable
to 3D scalar-valued modalities of the form $\mathbb{R}^3 \to \mathbb{R}$, such as structural magnetic
resonance imaging (sMRI) and computed tomography (CT). However, registration
of diffusion MRI (dMRI), an imaging modality primarily used to probe
neural microstructure, requires additional care.
Within each voxel, diffusion-weighted imaging captures the diffusion profile
of water molecules, which may or may not be constrained by the surrounding
tissue. The diffusion profile at a position $\mathbf{p} \in \mathbb{R}^3$ is characterized by the
ensemble average propagator (EAP), which is a probability density $P_{\mathbf{p}}(\mathbf{r}) : \mathbb{R}^3 \to \mathbb{R}_{\geq 0}$
describing the probability of a water molecule being displaced by $\mathbf{r}$ within the
effective diffusion time. Practitioners ultimately care about resolving the dominant
directions of diffusion at each voxel in order to perform downstream analysis like
tractography [28,21,10]. As a means to this end, assumptions are imposed on the
unknown EAP (e.g. Gaussian) in order to fit a simplified model (e.g. a diffusion
tensor) to the raw signal, with the hope that the principal directions of diffu-
sion are accurately captured. We refer to such models as derived representations,
examples of which include diffusion tensors (DTs) [4,45,41], fiber orientation dis-
tribution functions (fODFs) [39,38,24], orientation distribution functions [55,16],
Gaussian mixtures [25], and Hermite basis functions [30].
It follows that diffusion-weighted images and their derived representations
carry orientation-dependent information at each voxel that must be properly
transformed (i.e. reoriented) under a spatial deformation $\Phi : \mathbb{R}^3 \to \mathbb{R}^3$. While
the details of how this reorientation is carried out will depend on the derived
representation and domain-specific assumptions (e.g. finite strain), it is always
grounded in the fact that if $\mathbf{p}$ is displaced to $\Phi(\mathbf{p})$, then $S_m(\mathbf{p})$ is transformed
according to some action of the Jacobian $d\Phi_{\mathbf{p}}$. The literature is filled with
approaches to register both raw dMRI [18,53] and derived representations [1,52,11,33,32]
that extend the scalar-valued setting to include this required reorientation step.
These methods adhere to the classical image registration formulation where a
pair of images is iteratively aligned by minimizing a dissimilarity, as opposed to
methods that exploit deep learning.
While various deep learning methodologies [50,3,14,15] have been developed
to speedily handle the 3D, scalar-valued registration problem at a fraction of
classical runtimes, we are only aware of two data-driven frameworks [51,6] that
are designed to address the diffusion-weighted setting. Both of these methods
rely on VoxelMorph (VM)-inspired [3] backbones consisting of a UNet [35] that
estimates a nonlinear deformation $\Phi$ and a spatial transformer [22] that applies $\Phi$
to the moving image $S_m$ and handles reorientation. The key distinction between
them is their choice of input features. The first of these, DDMReg [51], requires
a fractional anisotropy (FA) map and tract orientation maps (TOMs) for each
diffusion-weighted image. Note that since these are scalar- and vector-valued
features, respectively, they can be fed to a vanilla VM backbone as is. On the
other hand, the method of [6], which we will refer to as MVCReg, requires square
root density parameterized fODF maps as input. Since these are manifold-valued
derived representations, the authors are forced to replace the VM backbone’s
vanilla convolutions with manifold-valued convolutions [9,5].
In this work, we demonstrate that data-driven registration should and can
be performed on the raw dMRI data. Thus our approach is “model-free” in this
sense. To facilitate the discussion, we refer to the scalar-valued setting as the
anat-p registration problem, and its aforementioned extension to the
diffusion-weighted regime as the diff-p registration problem (see Fig. 1). Although
derived representations offer the conceptual benefit of offloading the orientational
information to the codomain, we argue that their usage in registration suffers
from two pitfalls: (1) inherent limitations of the model chosen to approximate
the acquired data manifest themselves as lost information and (2) the
orientational information is not directly leveraged to drive the registration. This second
point motivates our formulation of the diff-pq registration problem, in which
the orientational information is pulled back into the domain, thus allowing the
deformation $\Phi$ to be a function of both position and orientation (see Fig. 1).

Type                anat-p                          diff-p                                               diff-pq
Modalities          sMRI, CT, FA                    DT, fODF, GMM                                        raw dMRI
Driving mechanism   position                        position                                             position and orientation
Registration        $S_m, S_f : \mathbb{R}^3 \to \mathbb{R}$    $S_m, S_f : \mathbb{R}^3 \to (\mathbb{R}^3 \to \mathbb{R})$, with reorientation    $S_m, S_f : \mathbb{R}^3 \oplus \mathbb{R}^3 \to \mathbb{R}$

Fig. 1. Comparison of three registration scenarios in which we try to estimate a
deformation $\Phi$ that minimizes $\mathcal{L}(S_f, S_m \circ \Phi)$ for some dissimilarity $\mathcal{L}$.
This change in perspective demands that we borrow tools from the field of
geometric deep learning (GDL) [7,48], with the aim of generating deformations
$\Phi$ that respect the geometry of the newly introduced non-Euclidean domain. We
introduce several novel contributions including (1) the diff-pq registration prob-
lem and its formalization, which in particular avoids explicit Jacobian estimation
and reorientation, (2) an SE(3)-equivariant, end-to-end, VM-inspired network
capable of preserving the geometry of a raw diffusion-weighted signal’s domain,
(3) use of maximum mean discrepancy (MMD) [36] in the Fourier space as a
loss function for dMRI registration, and (4) an experimental evaluation on HCP
dMRI scans comparing our method to SOTA dMRI registration approaches.
The rest of the paper is organized as follows: In Section 2, we briefly present
group-theoretic prerequisites before delving into the theoretical underpinnings of
our proposed registration framework. In Section 3, we present the construction
of the SE(3)-equivariant convolution layers and our VM-inspired architecture.
Section 4 contains experimental results and conclusions are drawn in Section 5.
2 Background
2.1 Representations and Equivariance
Definition 1. An action of a group $G$ on a space $M$ is a mapping $(g, p) \mapsto g \triangleright p$
satisfying the following properties for all $g, g' \in G$ and $p \in M$: (1) $e \triangleright p = p$,
where $e$ is the group identity, and (2) $g \triangleright (g' \triangleright p) = (gg') \triangleright p$.
Definition 2. A $d$-dimensional representation of a group $G$ is a map $\rho : G \to GL(d)$
that satisfies $\rho(gg') = \rho(g)\rho(g')$ for all $g, g' \in G$.
Definition 3. Let $M$ and $N$ be spaces with a group $G$ acting on each of them.
A function $\Psi : M \to N$ is equivariant to $G$ if $\Psi(g \triangleright_M p) = g \triangleright_N \Psi(p)$ for all
$g \in G$ and $p \in M$.
2.2 The Geometry of Diffusion-Weighted Images
To a first approximation, a diffusion-weighted acquisition sequence can be
described as follows: first, select a finite number of directions $\mathbf{g}_j \in S^2 \subset \mathbb{R}^3$ along
which diffusion-sensitized magnetic field gradients are applied; second, acquire
a 3D volume for each $\mathbf{g}_j$. At first glance, this would suggest that a raw dMRI
is a function $S : \mathbb{R}^3 \times S^2 \to \mathbb{R}$, since there is a 3D volume $S(-, \mathbf{g}) : \mathbb{R}^3 \to \mathbb{R}$
for every direction $\mathbf{g} \in S^2$. However, there are two additional idiosyncrasies of a
diffusion-weighted acquisition sequence that need to be introduced:
a) For a fixed direction $\mathbf{g}$, we can scale the strength of the applied
diffusion-sensitized magnetic field gradient, where the strength is proportional to (the
square root of) the so-called b-value. An important and required special case
is when $b = 0$.
b) For a fixed b-value, a dMRI exhibits antipodal symmetry, i.e. the 3D volume
acquired along $\mathbf{g}$ is equal to the 3D volume acquired along $-\mathbf{g}$.
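By (a), there exists a “shell” of directions for each b-value $b \geq 0$ (where $b = 0$
corresponds to a degenerate shell), meaning we should augment our domain to
$$\underbrace{\mathbb{R}^3 \times (S^2 \times \mathbb{R}_+)}_{b > 0} \;\cup\; \underbrace{(\mathbb{R}^3 \times \{0\})}_{b = 0} \;\simeq\; \mathbb{R}^3 \times \mathbb{R}^3. \tag{1}$$
By (b), a dMRI $S : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}$ must satisfy
$$S(\mathbf{p}, \mathbf{q}) = S(\mathbf{p}, -\mathbf{q}) \quad \text{for all } \mathbf{p}, \mathbf{q} \in \mathbb{R}^3. \tag{2}$$
This constraint is equivalent to saying that $S$ is a function
$$\big(\mathbb{R}^3 \times (\mathbb{RP}^2 \times \mathbb{R}_+)\big) \cup (\mathbb{R}^3 \times \{0\}) \to \mathbb{R}, \tag{3}$$
where $\mathbb{RP}^2$ is the real projective plane obtained by identifying antipodes on $S^2$.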
Nevertheless, for notational convenience, we will think of a dMRI as a function
$$\underbrace{\mathbb{R}^3}_{p\text{-space}} \times \underbrace{\mathbb{R}^3}_{q\text{-space}} \to \mathbb{R} \tag{4}$$
satisfying Eqn. 2, where we deliberately split the domain into p-space and
q-space. The reasoning for this is twofold. From an acquisition perspective, p-space
and q-space are sampled differently, with p-space being sampled on a fixed,
uniform Cartesian grid and q-space being sampled at $\mathbf{q} = \mathbf{0}$ and on approximately
uniform spherical grids corresponding to a small number of selected b-values.
This is called a multi-shell acquisition, and a single-shell acquisition is the
special case where the signal is only sampled at one nonzero b-value. From a
geometric perspective, Eqns. 2 and 4 are still not enough to fully characterize the
diffusion-weighted signal, because $\mathbb{R}^3 \times \mathbb{R}^3 \cong \mathbb{R}^6$ only describes the domain as
a set. In fact, p-space and q-space are physically coupled in a manner that can
only be captured by an action (see Def. 1) of $SE(3) = \mathbb{R}^3 \rtimes SO(3)$, the group of
3D roto-translations. This action is given by
$$(t, r) \triangleright (\mathbf{p}, \mathbf{q}) = (r\mathbf{p} + t,\; r\mathbf{q}) \tag{5}$$
for all $(t, r) \in SE(3)$ and $(\mathbf{p}, \mathbf{q}) \in \mathbb{R}^3 \times \mathbb{R}^3$. This entails that any dMRI $S$ must
appropriately transform under a roto-translation of the domain, i.e.
$$[(t, r) \triangleright S](\mathbf{p}, \mathbf{q}) = S\big(r^{-1}(\mathbf{p} - t),\; r^{-1}\mathbf{q}\big). \tag{6}$$
To prevent ourselves from viewing the domain $\mathbb{R}^3 \times \mathbb{R}^3$ as just a set in isolation,
we will follow [29] in using the notation $\mathbb{R}^3 \oplus \mathbb{R}^3$ as a reminder that p-space and
q-space are coupled in a manner less trivial than a direct product.
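
The coupling of p-space and q-space can be checked numerically. The following is a minimal sketch, not the authors' code: `toy_signal` is a hypothetical antipodally symmetric function standing in for a dMRI, and the helper names are chosen here for illustration. It applies the point action of Eqn. 5 and the induced signal transformation, which evaluates $S$ at the inverse-transformed point $(r^{-1}(\mathbf{p} - t), r^{-1}\mathbf{q})$.

```python
import numpy as np

def rotation_z(angle):
    """3D rotation about the z-axis by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def act_on_point(t, r, p, q):
    """Eqn. 5: (t, r) sends (p, q) to (r p + t, r q)."""
    return r @ p + t, r @ q

def act_on_signal(t, r, S):
    """Induced action on a signal: evaluate S at the inverse-transformed point."""
    r_inv = r.T  # the inverse of a rotation matrix is its transpose
    return lambda p, q: S(r_inv @ (p - t), r_inv @ q)

def toy_signal(p, q):
    """Hypothetical signal that is even in q, so S(p, q) = S(p, -q) as in Eqn. 2."""
    return np.exp(-np.dot(p, p)) * (1.0 + np.dot(p, q) ** 2 + q[0] ** 2)

t = np.array([1.0, -2.0, 0.5])
r = rotation_z(0.7)
p = np.array([0.3, 0.1, -0.2])
q = np.array([0.0, 1.0, 2.0])

# The transformed signal at the transformed point equals the original signal
# at the original point.
gp, gq = act_on_point(t, r, p, q)
print(np.isclose(act_on_signal(t, r, toy_signal)(gp, gq), toy_signal(p, q)))
```

Note that the translation part $t$ only touches p-space, while the rotation $r$ acts on both factors; this is the sense in which the domain is more than a direct product.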
2.3 The diff-pq Registration Problem
We demonstrated in Section 2.2 that the natural domain for a raw
diffusion-weighted signal is $\mathbb{R}^3 \oplus \mathbb{R}^3$, where the first factor denotes p-space and the
second factor denotes q-space. In the case of a derived representation, the q-space
is usually offloaded to the codomain, forcing us to formulate the deformation
of one derived representation into another (of the same type) as a diff-p
registration problem. However, much akin to the anat-p registration problem, we are
now dealing with scalar-valued functions whose alignment is simply a matter of
reparameterizing the domain via an appropriate diffeomorphism $\Phi$. Hence, this
naturally gives way to the diff-pq registration problem, which we state here for
completeness: Given raw diffusion-weighted signals $S_m, S_f : \mathbb{R}^3 \oplus \mathbb{R}^3 \to \mathbb{R}$,
estimate a diffeomorphism $\Phi : \mathbb{R}^3 \oplus \mathbb{R}^3 \to \mathbb{R}^3 \oplus \mathbb{R}^3$ that minimizes $\mathcal{L}(S_f, S_m \circ \Phi)$
for some dissimilarity $\mathcal{L}$.
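Notice that the notation $\mathbb{R}^3 \oplus \mathbb{R}^3$ carries weight in this setting, because
a diffeomorphism $\mathbb{R}^3 \oplus \mathbb{R}^3 \to \mathbb{R}^3 \oplus \mathbb{R}^3$ is not the same as a diffeomorphism
$\mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}^3 \times \mathbb{R}^3$. In particular, a diffeomorphism $\Phi : \mathbb{R}^3 \oplus \mathbb{R}^3 \to \mathbb{R}^3 \oplus \mathbb{R}^3$
must commute with the SE(3) group action in Eqn. 5, a condition that is best
visualized with the following commutative diagram:
$$\begin{array}{ccc}
\mathbb{R}^3 \oplus \mathbb{R}^3 & \xrightarrow{\;\;\Phi\;\;} & \mathbb{R}^3 \oplus \mathbb{R}^3 \\
{\scriptstyle (t,r)\triangleright} \downarrow & & \downarrow {\scriptstyle (t,r)\triangleright} \\
\mathbb{R}^3 \oplus \mathbb{R}^3 & \xrightarrow{\;\;\Phi\;\;} & \mathbb{R}^3 \oplus \mathbb{R}^3
\end{array} \tag{7}$$
The commutative diagram in Eqn. 7 says that both paths from the top-left
corner to the bottom-right corner are equivalent, i.e.
$\Phi((t, r) \triangleright (\mathbf{p}, \mathbf{q})) = (t, r) \triangleright \Phi(\mathbf{p}, \mathbf{q})$. Such a $\Phi$ is called an equivariant
diffeomorphism. Therefore, we require a sufficient condition that
guarantees that we can construct equivariant diffeomorphisms, which will ensure
that we do not violate the physical coupling of p- and q-space. Fortunately, we
can accomplish this by invoking the following theorem [46]: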
Theorem 1. [46] Let $M$ be a Riemannian manifold with a Lie group $G$ acting
on it. Let $X$ be an equivariant vector field on $M$, i.e. $X_{gp} = gX_p$ for all $p \in M$
and $g \in G$. Then, the flow generated by $X$ is an equivariant diffeomorphism.
Hence, it is sufficient for us to generate an equivariant vector (velocity) field,
which we may then integrate to obtain an equivariant diffeomorphism.
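
As an illustration of Theorem 1, the sketch below integrates a toy equivariant vector field and checks that its time-one flow commutes with the action of Eqn. 5. Everything here is an assumption made for the example: the field $X(\mathbf{p}, \mathbf{q}) = (a\mathbf{q}, c\mathbf{q})$ is hypothetical (it rescales $\mathbf{q}$, so it is not a field on a single shell), and a hand-written RK4 integrator stands in for the integration scheme used later in the paper.

```python
import numpy as np

def velocity(p, q, a=0.5, c=-0.3):
    """Toy SE(3)-equivariant field: both components are multiples of q,
    so rotating the input rotates the output and translations have no effect."""
    return a * q, c * q

def flow(p, q, n_steps=32):
    """Integrate d(p, q)/ds = velocity(p, q) from s = 0 to s = 1 with RK4."""
    x = np.concatenate([p, q])
    f = lambda y: np.concatenate(velocity(y[:3], y[3:]))
    h = 1.0 / n_steps
    for _ in range(n_steps):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x = x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return x[:3], x[3:]

def act(t, r, p, q):
    """Eqn. 5: (t, r) sends (p, q) to (r p + t, r q)."""
    return r @ p + t, r @ q

def random_rotation(rng):
    """Random 3D rotation from a QR decomposition, with determinant fixed to +1."""
    m, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(m) < 0:
        m[:, 0] = -m[:, 0]
    return m

rng = np.random.default_rng(0)
r, t = random_rotation(rng), rng.normal(size=3)
p, q = rng.normal(size=3), rng.normal(size=3)

# Both paths around the commutative diagram (Eqn. 7) give the same point.
flow_then_act = act(t, r, *flow(p, q))
act_then_flow = flow(*act(t, r, p, q))
print(np.allclose(flow_then_act[0], act_then_flow[0]),
      np.allclose(flow_then_act[1], act_then_flow[1]))
```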
Before proceeding, we remind the reader that, in practice, q-space is sam-
pled on a small number of concentric spherical shells. Therefore, although the
diffusion-weighted signal does exist on all of R3⊕R3, it is more prudent to re-
fine our treatment to the constituent shells and piece them back together when
needed. When we restrict ourselves to a single shell we will call the new domain
$\Omega$, which is $\mathbb{R}^3 \times S^2$ as a set (cf. Eqn. 1). The SE(3) group action (Eqn. 5) is
exactly the same as before, just restricted to this subspace.
2.4 Steerable Convolutions
Using convolutions to generate an equivariant velocity field on $\Omega$ imposes two
conditions on the convolutions themselves: (1) the convolutions must be SE(3)-
equivariant and (2) the convolutions must be able to output vector fields on Ω.
Although SE(3)-equivariant layers were developed in [26] to handle raw dMRI
signals, their method relies on the regular group representation which is incapable
of producing vector fields. Instead, we need to extend the steerable convolutions
introduced in [13,8], otherwise known as gauge equivariant convolutions, to the
domain Ω. We now give a brief primer on steerability; see [48] for more detail.
Let $M$ be a 5-dimensional Riemannian manifold with structure group SO(5),
the group of 5D rotations. In the steerable setting, feature maps become tensor
fields on $M$ (e.g. vector fields, which are order-one tensor fields). A feature map is
coordinatized w.r.t. a gauge, a local choice of coordinate frame. A steerable convolution
maps an input feature map $f_{in}$ of type $\rho_{in}$ to an output feature map $f_{out}$ of
type $\rho_{out}$, where $\rho_{in}$ and $\rho_{out}$ are group representations that encode the
transformation behavior of the tensor components under a change of gauge [48]. Let
$f_{in}$ be a feature map of type $\rho_{in}$ and $K : \mathbb{R}^5 \to \mathbb{R}^{d_{out} \times d_{in}}$ a matrix-valued filter
where $d_{in}$ and $d_{out}$ are the dimensions of the input and output tensors,
respectively. Letting $q_v := \exp_p(w_p v)$ (where $\exp$ denotes the Riemannian exponential map
and $w_p : \mathbb{R}^5 \to T_pM$ a gauge), the convolved feature map $f_{out} = K \star f_{in}$ is given
pointwise w.r.t. $w_p$ by
$$f_{out}(p) := \int_{\mathbb{R}^5} K(v)\, \rho_{in}(t_{p \leftarrow q_v})\, f_{in}(q_v)\, dv, \tag{8}$$
where $t_{p \leftarrow q_v}$ denotes the SO(5)-valued gauge transformation taking the frame
on $q_v$ (after parallel transport to $p$) to the frame on $p$. Eqn. 8 is equivariant
to a change of gauge at $p$ if and only if $K$ is SO(5)-steerable, i.e. $K$ satisfies
$$K(t^{-1}v) = \rho_{out}(t^{-1})\, K(v)\, \rho_{in}(t) \tag{9}$$
for all $t \in SO(5)$ and $v \in \mathbb{R}^5$ [13]. A critical result of [48] shows that, in the case
where $M = \Omega$, steerable convolutions with SO(5)-steerable filters are equivariant
to 3D roto-translations.
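
To make the steerability constraint of Eqn. 9 concrete, the following sketch checks it numerically for a hand-built filter. This is an illustration under stated assumptions, not the filter used in the paper: both $\rho_{in}$ and $\rho_{out}$ are taken to be the standard representation of SO(5), and `toy_filter` and `non_steerable_filter` are hypothetical examples.

```python
import numpy as np

def random_so_n(n, rng):
    """Random n x n rotation from a QR decomposition, with determinant fixed to +1."""
    m, _ = np.linalg.qr(rng.normal(size=(n, n)))
    if np.linalg.det(m) < 0:
        m[:, 0] = -m[:, 0]
    return m

def toy_filter(v):
    """K(v) = exp(-|v|^2) v v^T, a matrix-valued filter R^5 -> R^{5x5}."""
    return np.exp(-(v @ v)) * np.outer(v, v)

def non_steerable_filter(v):
    """K(v) = diag(v): depends on the coordinate axes, so it is not steerable."""
    return np.diag(v)

def steerability_residual(K, t, v):
    """Largest entrywise gap between the two sides of Eqn. 9,
    assuming rho_in(t) = rho_out(t) = t (standard representation)."""
    t_inv = t.T  # inverse of a rotation matrix
    return np.max(np.abs(K(t_inv @ v) - t_inv @ K(v) @ t))

rng = np.random.default_rng(0)
t = random_so_n(5, rng)
v = rng.normal(size=5)
print(steerability_residual(toy_filter, t, v))            # numerically zero
print(steerability_residual(non_steerable_filter, t, v))  # clearly nonzero
```

The outer-product filter passes because $(t^{-1}v)(t^{-1}v)^\top = t^{-1}(vv^\top)t$ for any rotation $t$, while the radial factor is unchanged by rotations.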
2.5 An MMD Loss via Characteristic Functions
We argue that the ultimate goal of raw dMRI registration is not merely to align
the raw signals $S_f$ and $S_m$, but rather to match their corresponding EAPs. As we
shall see in Section 4, this claim is evidenced by the fact that registration quality
in the dMRI setting is evaluated using white matter fiber bundle segmentations,
which are extracted using fODF peaks (an approximation of EAP peaks). The
relation between the raw signal $S$ and its associated EAP, $P$, is given by
$$P_{\mathbf{p}}(\mathbf{r}) = \int_{\mathbb{R}^3} e^{2\pi i\, \mathbf{q}^\top \mathbf{r}}\, E(\mathbf{p}, \mathbf{q})\, d\mathbf{q}, \tag{10}$$
where $E(\mathbf{p}, \mathbf{q}) = S(\mathbf{p}, \mathbf{q}) / S(\mathbf{p}, \mathbf{0})$ is the signal attenuation. Therefore, if every
diffusion-weighted volume is normalized by the $b = 0$ volume, the resulting $E$ is
directly related to the EAP via a Fourier transform. Said differently, for every
position $\mathbf{p}$, $E(\mathbf{p}, -)$ is the characteristic function of the probability density $P_{\mathbf{p}}$.
We now introduce a result from the theory of reproducing kernel Hilbert spaces
(RKHSs) that permits us to indirectly minimize the maximum mean discrepancy
(MMD) between EAPs via a modified $L^2$ loss applied to the signal attenuations.
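Suppose $P$ and $Q$ are two distributions on $\mathbb{R}^3$ with corresponding
characteristic functions (Fourier transforms) $\varphi_P$ and $\varphi_Q$ respectively, and let $k$ be a kernel
function on $\mathbb{R}^3$, associated with the RKHS $\mathcal{H}_k$. The MMD between $P$ and $Q$ is
defined as
$$\mathrm{MMD}(P, Q) = \sup_{f \in \mathcal{H}_k,\, \|f\| \leq 1} \left( \int_{\mathbb{R}^3} f\, dP - \int_{\mathbb{R}^3} f\, dQ \right). \tag{11}$$
Bochner’s theorem [36, Thm. 3] states that there exists a finite Borel measure
$\Lambda$ on $\mathbb{R}^3$ such that
$$k(\mathbf{r}, \mathbf{r}') = \int_{\mathbb{R}^3} e^{-i(\mathbf{r} - \mathbf{r}')^\top \mathbf{q}}\, d\Lambda(\mathbf{q}). \tag{12}$$
It was shown in [36] that
$$\mathrm{MMD}(P, Q) = \|\varphi_P - \varphi_Q\|_{L^2(\mathbb{R}^3, \Lambda)}, \tag{13}$$
i.e. the MMD between $P$ and $Q$ equals the $L^2$ distance between their
characteristic functions with respect to the measure $\Lambda$. In this work, we choose $\Lambda$ to be the
multivariate Gaussian distribution (measure) $N(0, \sigma^2 I_3)$, whose corresponding
kernel $k$ is exactly the Gaussian kernel $k(\mathbf{r}, \mathbf{r}') = \exp(-\frac{\sigma^2}{2}\|\mathbf{r} - \mathbf{r}'\|^2)$. In this case,
squaring Eqn. 13 shows that the squared MMD between $P$ and $Q$ equals
$$\mathrm{MMD}^2(P, Q) = \mathbb{E}\,|\varphi_P(X) - \varphi_Q(X)|^2, \quad X \sim N(0, \sigma^2 I_3). \tag{14}$$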
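Letting $\mathcal{E}(\mathbf{p}, \mathbf{q}) := |E_f(\mathbf{p}, \mathbf{q}) - [E_m \circ \Phi](\mathbf{p}, \mathbf{q})|^2$, this translates to the following
loss in our setting:
$$\mathcal{L}(E_f, E_m \circ \Phi) = C \sum_{\mathbf{p}, \mathbf{q}} \mathcal{E}(\mathbf{p}, \mathbf{q}) \cdot e^{-\|\mathbf{q}\|^2 / 2\sigma^2} \cdot \|\mathbf{q}\|^2 + \lambda \sum_{\mathbf{p}, \mathbf{q}} \|v(\mathbf{p}, \mathbf{q})\|^2. \tag{15}$$
Here, $\lambda$ is the regularization hyperparameter, $v$ is the velocity field generated by
the UNet described in Section 3, and $C = (2\pi\sigma^2)^{-3/2}$ is a normalization constant.
Note that the guarantee provided by Eqn. 13 would not hold if we simply resorted
to a standard MSE loss, since the Lebesgue measure on $\mathbb{R}^3$ is not finite.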
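
The loss of Eqn. 15 can be sketched in a few lines of NumPy. This is a hedged illustration rather than the training code: the array layout (voxels first, q-samples last), the function name `mmd_registration_loss`, and the default values of `sigma` and `lam` are all assumptions, since the text does not state them.

```python
import numpy as np

def mmd_registration_loss(E_fixed, E_warped, qvecs, velocity, sigma=1.0, lam=0.01):
    """Gaussian-weighted squared error on attenuations plus velocity penalty (Eqn. 15).

    E_fixed, E_warped : arrays of shape (X, Y, Z, Nq), signal attenuations
    qvecs             : array of shape (Nq, 3), q-space sample locations
    velocity          : array of any shape holding the velocity components
    sigma, lam        : placeholder hyperparameter values (assumed here)
    """
    q_norm_sq = np.sum(qvecs ** 2, axis=1)                       # ||q||^2, shape (Nq,)
    weights = np.exp(-q_norm_sq / (2.0 * sigma ** 2)) * q_norm_sq
    C = (2.0 * np.pi * sigma ** 2) ** (-1.5)                     # normalization constant
    sq_err = np.abs(E_fixed - E_warped) ** 2                     # script-E in the text
    data_term = C * np.sum(sq_err * weights)                     # weights broadcast over Nq
    reg_term = lam * np.sum(velocity ** 2)
    return data_term + reg_term

# Tiny synthetic example: 2x2x2 voxels, 4 q-samples, 6 velocity components.
rng = np.random.default_rng(0)
E_f = rng.uniform(size=(2, 2, 2, 4))
E_w = rng.uniform(size=(2, 2, 2, 4))
qvecs = rng.normal(size=(4, 3))
v = 0.1 * rng.normal(size=(2, 2, 2, 4, 6))
print(mmd_registration_loss(E_f, E_w, qvecs, v))
```

One consequence of the $\|\mathbf{q}\|^2$ factor as written in Eqn. 15 is that samples at $\mathbf{q} = \mathbf{0}$ receive zero weight in the data term.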
3 Implementation
3.1 Constructing SE(3)-Equivariant Convolution Layers
The most challenging obstacle to implementing the steerable convolutions of
Section 2.4 in practice is generating an SO(5)-steerable filter that satisfies Eqn.
9. Historically, this is done in one of two ways: (1) analytically solve for a basis
of the linear subspace of filters satisfying Eqn. 9, or (2) parameterize the con-
volution filter using an equivariant MLP. We opt for the second approach since
it circumvents the need to solve for an explicit basis, and it has shown superior
performance in equivariant tasks [54]. We can invoke the following lemma to
parameterize a filter that satisfies Eqn. 9 using an MLP:
Lemma 1. [54] If a filter $K$ is parameterized by an SO(5)-equivariant MLP
with input representation $\rho_{st}$ and output representation $\rho_\otimes := \rho_{in} \otimes \rho_{out}$, then
the filter satisfies the steerability constraint in Eqn. 9.
In Lemma 1 above, $\rho_{st}$ denotes the standard representation given by $\rho_{st}(g) = g$
and $\rho_\otimes$ denotes the tensor product representation given by
$\rho_\otimes(g) = \rho_{in}(g) \otimes \rho_{out}(g)$. We can construct an SO(5)-equivariant MLP matching the description
of Lemma 1 by using the open source method of [19]. To summarize briefly, the
authors of [19] efficiently build equivariant MLPs by decomposing the equivari-
ance constraint into a finite set of simpler constraints involving the Lie group’s
generators, which are elements of its Lie algebra. These constraints can be solved
efficiently at initialization time, and are thus a one-time computational cost.
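
For intuition on why Lemma 1 yields Eqn. 9, a short derivation may help. It rests on two assumptions that are not spelled out in the text: the MLP outputs the column-major flattening $\tilde{K}(v) = \mathrm{vec}(K(v))$ of the filter, and $\rho_{in}$ is a real orthogonal representation, so that $\rho_{in}(t)^\top = \rho_{in}(t^{-1})$.

```latex
\begin{align*}
\tilde{K}(t v)
  &= \big(\rho_{in}(t) \otimes \rho_{out}(t)\big)\, \tilde{K}(v)
  && \text{(equivariance of the MLP)} \\
  &= \mathrm{vec}\big(\rho_{out}(t)\, K(v)\, \rho_{in}(t)^{\top}\big)
  && \text{(using } \mathrm{vec}(AXB) = (B^{\top} \otimes A)\, \mathrm{vec}(X)\text{)} \\
\Longrightarrow \quad K(t v)
  &= \rho_{out}(t)\, K(v)\, \rho_{in}(t^{-1})
  && \text{(orthogonality of } \rho_{in}\text{)} \\
\Longrightarrow \quad K(t^{-1} v)
  &= \rho_{out}(t^{-1})\, K(v)\, \rho_{in}(t)
  && \text{(replacing } t \text{ by } t^{-1}\text{)}
\end{align*}
```

The last line is exactly the steerability constraint of Eqn. 9.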
3.2 Network Input
To prevent wasting model capacity on learning large, parameterizable motion
between images, we follow the preparatory step of affinely aligning a moving
image to the fixed image, as suggested in [6]. This is important for two reasons.
Firstly, nonlinear registration algorithms generally require a sensible initializa-
tion to perform well. Secondly, since VM-inspired registration makes the implicit
assumption that the moving and fixed images are roughly aligned in p-space be-
fore concatenation, we need to ensure that this assumption is also met in q-space
so that concatenation remains meaningful. Therefore, we take advantage of the
affine pre-alignment step to also resample the moving image’s q-space at the
fixed image’s q-vectors. The framework of [37,18] permits us to perform both
tasks in one step using angular interpolation (see Eqn. 19).
As motivated in Section 2.5, we then normalize the diffusion-weighted
volumes by the mean $b = 0$ volume, and thus we now call the moving and fixed
images $E_m$ and $E_f$, respectively. Finally, to address noise in the raw signal
attenuations, we apply a low-pass filter to the training data. For a fixed voxel,
the intensities on a given shell are expanded in terms of a truncated spherical
harmonic basis ($\ell_{max} = 5$). This yields $\sum_{\ell=0}^{\ell_{max}} (2\ell + 1) = 36$ coefficients per shell. These
coefficients are then smoothed spatially across voxels using a Gaussian filter. The
signal attenuations are subsequently reconstructed from the smoothed spherical
harmonic coefficients, yielding a denoised representation suitable for training.
3.3 Network Architecture
In the spirit of DDMReg [51] and MVCReg [6], we continue the successful trend
of using a VM-inspired backbone to perform image registration, while also in-
corporating the crucial property of SE(3)-equivariance. The first module is the
UNet, which is responsible for generating a velocity field $v$ on the domain $\Omega$.
Our UNet has an encoder depth of 3. Each layer consists of an SE(3)-equivariant
convolution as described in Sections 2.4 and 3.1, followed by swish nonlineari-
ties [34] on scalar features and gated nonlinearities [49] on higher order features,
followed by e3nn-inspired batch normalization [20]. We interleave the encoding
layers with average pooling across p-space.
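Next, inspired by [14], a scaling-and-squaring layer integrates the velocity
field $v$ to obtain a diffeomorphism $\Phi$. We relate $v$ to $\Phi$ via the differential equation
$$\frac{d\Phi_t}{dt} = v(\Phi_t), \quad \Phi_0 = \mathrm{id} \tag{16}$$
by stipulating that $\Phi = \Phi_1$. Hence, $\Phi = \exp(v)$, and scaling-and-squaring is a
numerical technique that approximates this exponential map. This is done by
first scaling $v$ by $2^{-N}$ to create an incrementally small displacement
$\Phi^{(0)}(\mathbf{p}, \mathbf{g}) = \exp(2^{-N} v(\mathbf{p}, \mathbf{g}))$. Recall that $(\mathbf{p}, \mathbf{g}) \in \Omega$ (see Section 2.3). Then, the displacement is
squared iteratively $N$ times:
$$\Phi^{(j+1)}(\mathbf{p}, \mathbf{g}) = \Phi^{(j)}(\Phi^{(j)}(\mathbf{p}, \mathbf{g})), \quad j = 0, 1, \ldots, N - 1. \tag{17}$$
After $N$ squaring steps, $\Phi^{(N)}(\mathbf{p}, \mathbf{g})$ approximates $\Phi(\mathbf{p}, \mathbf{g}) = \exp(v(\mathbf{p}, \mathbf{g}))$. We
set $N = 4$ in our implementation. We represent a tangent vector $v(\mathbf{p}, \mathbf{g})$ as a
6D vector whose first three components are the tangent vector $v_{\mathbb{R}^3}$ and whose
last three components are the tangent vector $v_{S^2}$ (embedded in $\mathbb{R}^3$). Using this
representation, we have that
$$\exp(v(\mathbf{p}, \mathbf{g})) = \left( \mathbf{p} + v_{\mathbb{R}^3},\; \cos(\|v_{S^2}\|)\, \mathbf{g} + \sin(\|v_{S^2}\|)\, \frac{v_{S^2}}{\|v_{S^2}\|} \right). \tag{18}$$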
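
The integration step can be sketched as follows. This is a simplified illustration under stated assumptions: the velocity is given as a Python callable rather than a sampled field, so composition is plain function composition and no interpolation is needed, and `toy_velocity` (a constant shift in p-space plus a rotation of $\mathbf{g}$ about a fixed axis) is a hypothetical field chosen because its exact flow is known.

```python
import numpy as np

def exp_map(p, g, v_p, v_g, eps=1e-12):
    """Eqn. 18: straight line in R^3, great-circle step on S^2.
    v_g is assumed tangent to the sphere at g (orthogonal to g)."""
    norm = np.linalg.norm(v_g)
    if norm < eps:  # zero tangent vector: g stays where it is
        return p + v_p, g
    g_new = np.cos(norm) * g + np.sin(norm) * v_g / norm
    return p + v_p, g_new / np.linalg.norm(g_new)  # renormalize against round-off

def _square(phi):
    """Compose a map with itself, as in Eqn. 17."""
    return lambda p, g: phi(*phi(p, g))

def scaling_and_squaring(velocity, n_steps=4):
    """Approximate exp(v) by one small step followed by n_steps self-compositions."""
    scale = 2.0 ** (-n_steps)

    def phi(p, g):
        v_p, v_g = velocity(p, g)
        return exp_map(p, g, scale * v_p, scale * v_g)

    for _ in range(n_steps):
        phi = _square(phi)
    return phi

def toy_velocity(p, g, shift=np.array([1.0, 0.0, 0.0]),
                 omega=np.array([0.0, 0.0, 0.8])):
    """Hypothetical field: constant shift in p, rotation of g about `omega`."""
    return shift, np.cross(omega, g)  # omega x g is tangent to the sphere at g

phi = scaling_and_squaring(toy_velocity, n_steps=4)
p1, g1 = phi(np.zeros(3), np.array([1.0, 0.0, 0.0]))
print(p1, g1)  # g1 is the start direction rotated by 0.8 rad about the z-axis
```

With a sampled velocity field, each squaring step would instead resample the current displacement at its own displaced locations, which is where the spatial and angular interpolation described next comes in.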
[Figure 2: pipeline schematic with blocks labeled moving image, fixed image, velocity field, integration layer, spatial transform, warped image, and loss.]
Fig. 2. Overview of the proposed registration pipeline, parameterized by an
SE(3)-equivariant UNet $g_\theta$.
Finally, a spatial transformer module tailored to the manifold $\Omega$ applies the
computed displacement $\Phi$ to the moving image $E_m$, yielding the warped image.
This is done by using trilinear interpolation in p-space and angular interpolation
in q-space [37]. Given the signal attenuation $E$ sampled at orientations $\mathbf{g}_j$, we
interpolate the attenuation value at $(\mathbf{p}, \mathbf{g}) \in \Omega$ as
$$E(\mathbf{p}, \mathbf{g}) = \frac{\sum_j w_j\, E(\mathbf{p}, \mathbf{g}_j)}{\sum_j w_j}, \quad w_j = e^{-(\arccos(|\mathbf{g}^\top \mathbf{g}_j|))^2 / 2\sigma^2} \tag{19}$$
where each $w_j$ is a spherical radial basis function (RBF) centered at $\mathbf{g}_j$ with
Gaussian kernel and smoothness parameter $\sigma$. The absolute value comes from
Eqn. 2. We set $\sigma = 0.1$ in our implementation. See Figure 2 for a schematic of
our network architecture.
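
The angular interpolation of Eqn. 19 can be written compactly in NumPy. The sketch below is illustrative only: the function name `angular_interpolate` and the array layout (sampled directions on the last axis) are assumptions, and the three sample directions in the usage example are made up.

```python
import numpy as np

def angular_interpolate(E_samples, g_samples, g, sigma=0.1):
    """Spherical RBF interpolation of Eqn. 19.

    E_samples : array of shape (..., N), attenuation at N sampled directions
    g_samples : array of shape (N, 3), unit gradient directions g_j
    g         : array of shape (3,), unit query direction
    """
    # |g^T g_j| enforces antipodal symmetry; clip guards arccos against round-off.
    cosines = np.clip(np.abs(g_samples @ g), 0.0, 1.0)
    angles = np.arccos(cosines)
    w = np.exp(-angles ** 2 / (2.0 * sigma ** 2))
    return (E_samples @ w) / np.sum(w)

g_samples = np.eye(3)                 # three made-up sample directions
E_samples = np.array([0.2, 0.5, 0.9])
g = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)

print(angular_interpolate(E_samples, g_samples, g))   # midway between x and y samples
print(angular_interpolate(E_samples, g_samples, -g))  # same value by antipodal symmetry
```

With $\sigma = 0.1$ the weights decay quickly with angle, so the interpolated value is dominated by the nearest sampled directions.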
4 Experiments
4.1 dMRI Registration Applied to HCP Data
Dataset. We conducted training and evaluation using minimally preprocessed
dMRI data from the Human Connectome Project (HCP) Young Adult dataset.
This dataset contains dMRI brain scans from 1,200 individuals aged 22–35. De-
tailed information about the acquisition parameters, subject selection criteria,
and preprocessing steps can be found in the original HCP study [42]. For our
analysis, we selected the same 400 subjects as in MVCReg to facilitate a more
direct performance comparison. Using MRtrix3 [40], 51 (25 validation subjects,
25 testing subjects, and 1 fixed subject) of these subjects underwent three-tissue
response function estimation [17] and multi-shell multi-tissue constrained spher-
ical deconvolution [23] to yield fODF maps. These were subsequently intensity
normalized and bias field corrected [31] before finally extracting the fODF peaks,
which are needed for model evaluation.
Table 1. Test time performance of various registration methods, averaged over all 25
test subjects and all 72 available white matter tracts.

Method     Ours      mrregister  VoxelMorph  DDMReg   MVCReg  MVVSReg
Modality   raw dMRI  fODF        FA          FA, TOM  fODF    fODF
Dice       0.7468    0.7601      0.7126      0.7417   0.7317  0.7493

Evaluation. We evaluate registration quality by comparing the overlap of known
white matter tracts in the warped and fixed image. We measure this overlap using
the Dice score. To generate segmentations of white matter tracts, we use the well-
validated TractSeg segmentation model [47], which is capable of segmenting 72
distinct white matter tracts. The 51 peak maps produced above serve as input
to the TractSeg software. For each of the 25 validation subjects and the fixed
subject, we generated tract orientation maps (TOMs) for all 72 bundles. In
the diff-p setting, validation can be easily performed by applying the estimated
deformation $\Phi : \mathbb{R}^3 \to \mathbb{R}^3$ to bundle segmentations of the moving image $S_m$ and
measuring the resulting overlap with bundle segmentations of the fixed image
$S_f$. However, in our diff-pq setting, it does not immediately make sense to warp
a bundle segmentation (a 3D volume) with a deformation $\Omega \to \Omega$. Therefore, we
are forced to predict our model’s test time behavior by monitoring a different
metric, namely the alignment of warped TOMs with the fixed image’s TOMs. At
test time, the 25 moving test subjects are warped by the model and subsequently
fed to TractSeg to generate bundle segmentation masks for all 72 bundles, which
are then compared with the fixed subject’s bundle segmentation masks.
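
For reference, the Dice overlap used above can be computed as in the following sketch. The function names and the convention for two empty masks are assumptions made here; the masks stand in for per-tract binary segmentations such as those produced by TractSeg.

```python
import numpy as np

def dice_score(mask_a, mask_b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) between two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # assumed convention: two empty masks count as perfect overlap
    return 2.0 * np.logical_and(a, b).sum() / denom

def mean_dice(warped_masks, fixed_masks):
    """Average Dice over paired masks, e.g. one pair per white matter tract."""
    return float(np.mean([dice_score(w, f) for w, f in zip(warped_masks, fixed_masks)]))

warped = [np.array([1, 1, 0, 0]), np.array([1, 1, 1, 0])]
fixed = [np.array([0, 1, 1, 0]), np.array([1, 1, 1, 0])]
print(mean_dice(warped, fixed))  # average of 0.5 and 1.0
```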
We selected the same fixed image as in MVCReg [6], and all other subjects
are registered to this fixed image. Hence, there are 399 moving/fixed image pairs,
349 of which are used for training, 25 of which are used for validation, and 25 of
which are used for testing. Since this task is memory-intensive, we are forced to
downsample the spatial dimensions from 145 × 174 × 145 to 72 × 88 × 72.
We trained our model for 500 epochs and we used stochastic gradient descent
(SGD) with Nesterov momentum as an optimizer, which we found worked better
than Adam on our small batch size of 1. Our initial learning rate was 0.001 and
we used a learning rate scheduler that halves the learning rate every 100 epochs.
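
The learning rate schedule described above amounts to a step decay. A minimal sketch follows, assuming the halving takes effect at epochs 100, 200, 300, and 400 (the exact boundary convention is not stated in the text); `learning_rate` is a name chosen here.

```python
def learning_rate(epoch, initial_lr=1e-3, step=100, factor=0.5):
    """Step decay: multiply the initial rate by `factor` once per completed `step` epochs."""
    return initial_lr * factor ** (epoch // step)

# Rates at the start of training, after the first halving, and in the final epoch.
for epoch in (0, 100, 499):
    print(epoch, learning_rate(epoch))
```

Under this convention the rate in the final 100 of the 500 epochs is one sixteenth of the initial value.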
We compare our method against one classical approach (mrregister) and
four deep learning approaches (VoxelMorph, DDMReg, MVCReg, MVVSReg).
MRtrix3’s mrregister [33,32] is a classical method based on symmetric normal-
ization (SyN). We also include a VoxelMorph baseline trained on FA maps. Both
DDMReg and MVCReg are methods specifically designed for dMRI registration,
as discussed in the introduction. MVVSReg [6] is an extension of MVCReg that
uses second order, manifold-valued convolutions.
Results. Quantitative and qualitative results on HCP dMRI registration are
presented in Table 1 and Figure 3, respectively. Figure 3 uses checkerboard masks
in parts (d) and (e) to overlay alternating patches from two images, enabling
visual assessment of registration quality by highlighting alignment differences
between the moving and fixed images (d) or the warped and fixed images (e).
Among the data-driven methods, ours achieves the highest Dice score except for
MVVSReg (0.7468 vs. 0.7493); the classical mrregister baseline scores highest
overall (0.7601). We attribute MVVSReg's advantage to its higher order
convolutions, which capture richer features, though at the expense of higher
time and memory complexity [6]. Furthermore, we emphasize
that none of the other methods are capable of registering the raw
diffusion-weighted data. Hence, they require additional offline overhead to estimate input
features. Overall, our method strikes a balance between competitive registration
quality and generality, offering a solution to practitioners who do not want to
commit a priori to a specific derived representation at this stage of their pipeline.

Fig. 3. Visualization of registration results. Top row: $b = 0$ axial slices with overlaid
fODF maps. Bottom row: $b = 0$ coronal slices with overlaid fODF maps. (a) Moving
source image, (b) warped source image, (c) fixed target image, (d) checkerboard view
of moving vs. fixed image, (e) checkerboard view of warped vs. fixed image. The
highlighted ROI shows a tract discontinuity in (d) that is resolved after warping in (e).
Note that SE(3)-equivariance preserves white matter fiber tracts during warping.
5 Conclusion
In this paper, we have introduced a novel framework for registering raw dMRI
signals that more directly leverages orientational information. We accomplish
this by constructing an SE(3)-equivariant UNet to generate velocity fields on
the raw signal domain, and by applying key theoretical results to ensure that
the physical coupling of p- and q-space is preserved. To our knowledge, we
are the first to present a data-driven technique that registers the raw dMRI
signals, as opposed to first computing some derived representation. Our HCP
dMRI registration experiment demonstrates that the proposed method achieves
competitive performance against state-of-the-art deep learning registration ap-
proaches. In future work, we aim to apply our proposed registration method to
dMRI scans from patient groups with neurodegenerative disorders.
References
1. Alexander, D.C., Pierpaoli, C., Basser, P.J., Gee, J.C.: Spatial transformations of
diffusion tensor magnetic resonance images. IEEE Transactions on Medical Imaging
20(11), 1131–1139 (2001)
2. Avants, B.B., Epstein, C.L., Grossman, M., Gee, J.C.: Symmetric diffeomorphic
image registration with cross-correlation: evaluating automated labeling of elderly
and neurodegenerative brain. Medical Image Analysis 12(1), 26–41 (2008)
3. Balakrishnan, G., Zhao, A., Sabuncu, M.R., Guttag, J., Dalca, A.V.: VoxelMorph:
a learning framework for deformable medical image registration. IEEE Transactions
on Medical Imaging 38(8), 1788–1800 (2019)
4. Basser, P.J., Mattiello, J., Lebihan, D.: Estimation of the effective self-diffusion
tensor from the NMR spin echo. Journal of Magnetic Resonance Ser. B 103(3),
247–254 (1994)
5. Bouza, J.J., Yang, C.H., Vaillancourt, D., Vemuri, B.C.: A higher order manifold-
valued convolutional neural network with applications to diffusion mri processing.
In: Information Processing in Medical Imaging: 27th International Conference,
IPMI 2021, Virtual Event, June 28–June 30, 2021, Proceedings 27. pp. 304–317.
Springer (2021)
6. Bouza, J.J., Yang, C.H., Vemuri, B.C.: Geometric deep learning for unsupervised
registration of diffusion magnetic resonance images. In: International Conference
on Information Processing in Medical Imaging. pp. 563–575. Springer (2023)
7. Bronstein, M.M., Bruna, J., LeCun, Y., Szlam, A., Vandergheynst, P.: Geometric
deep learning: going beyond euclidean data. IEEE Signal Processing Magazine
34(4), 18–42 (2017)
8. Cesa, G., Lang, L., Weiler, M.: A program to build E(N)-equivariant steerable
CNNs. In: International Conference on Learning Representations (ICLR) (2022),
https://openreview.net/forum?id=WE4qe9xlnQw
9. Chakraborty, R., Bouza, J., Manton, J.H., Vemuri, B.C.: Manifoldnet: A deep
neural network for manifold-valued data with applications. IEEE Transactions on
Pattern Analysis and Machine Intelligence 44(2), 799–810 (2020)
10. Cheng, G., Salehian, H., Forder, J., Vemuri, B.C.: Tractography from hardi using
an intrinsic unscented kalman filter. IEEE Transactions on Medical Imaging 35(1),
298–305 (2014)
11. Cheng, G., Vemuri, B.C., Carney, P.R., Mareci, T.H.: Non-rigid registration of
high angular resolution diffusion images represented by gaussian mixture fields.
In: Medical Image Computing and Computer-Assisted Intervention–MICCAI 2009:
12th International Conference, London, UK, September 20-24, 2009, Proceedings,
Part I 12. pp. 190–197. Springer (2009)
12. Christensen, G.E., Rabbitt, R.D., Miller, M.I.: Deformable templates using large
deformation kinematics. IEEE transactions on image processing 5(10), 1435–1447
(1996)
13. Cohen, T., Weiler, M., Kicanaoglu, B., Welling, M.: Gauge equivariant convo-
lutional networks and the icosahedral CNN. In: Proceedings of the 36th Inter-
national Conference on Machine Learning. Proceedings of Machine Learning Re-
search, vol. 97, pp. 1321–1330. PMLR (09–15 Jun 2019)
14. Dalca, A.V., Balakrishnan, G., Guttag, J., Sabuncu, M.R.: Unsupervised learning
of probabilistic diffeomorphic registration for images and surfaces. Medical image
analysis 57, 226–236 (2019)
15. Demir, B., Tian, L., Greer, H., Kwitt, R., Vialard, F.X., Estépar, R.S.J., Bouix,
S., Rushmore, R., Ebrahim, E., Niethammer, M.: Multigradicon: A foundation
model for multimodal medical image registration. In: International Workshop on
Biomedical Image Registration. pp. 3–18. Springer (2024)
16. Descoteaux, M., Deriche, R., Le Bihan, D., Mangin, J.F., Poupon, C.: Multiple q-
shell diffusion propagator imaging. Medical Image Analysis 15(4), 603–621 (2010)
17. Dhollander, T., Raffelt, D., Connelly, A.: Unsupervised 3-tissue response function
estimation from single-shell or multi-shell diffusion mr data without a co-registered
t1 image. In: ISMRM workshop on breaking the barriers of diffusion MRI. vol. 5.
Lisbon, Portugal (2016)
18. Duarte-Carvajalino, J.M., Sapiro, G., Harel, N., Lenglet, C.: A framework for linear
and non-linear registration of diffusion-weighted mris using angular interpolation.
Frontiers in neuroscience 7, 41 (2013)
19. Finzi, M., Welling, M., Wilson, A.G.: A practical method for constructing equivari-
ant multilayer perceptrons for arbitrary matrix groups. In: International conference
on machine learning. pp. 3318–3328. PMLR (2021)
20. Geiger, M., Smidt, T.: e3nn: Euclidean neural networks. arXiv preprint
arXiv:2207.09453 (2022)
21. Hagmann, P., Jonasson, L., Maeder, P., Thiran, J.P., Wedeen, V.J., Meuli, R.: Un-
derstanding diffusion mr imaging techniques: from scalar diffusion-weighted imag-
ing to diffusion tensor imaging and beyond. Radiographics 26(suppl_1), S205–S223
(2006)
22. Jaderberg, M., Simonyan, K., Zisserman, A., et al.: Spatial transformer networks.
Advances in neural information processing systems 28 (2015)
23. Jeurissen, B., Tournier, J.D., Dhollander, T., Connelly, A., Sijbers, J.: Multi-tissue
constrained spherical deconvolution for improved analysis of multi-shell diffusion
mri data. NeuroImage 103, 411–426 (2014)
24. Jian, B., Vemuri, B.C.: A unified computational framework for deconvolution to
reconstruct multiple fibers from diffusion weighted mri. IEEE Trans. on Med. Imag-
ing 26(11), 1464–1471 (2007)
25. Jian, B., Vemuri, B.C., Özarslan, E., Carney, P.R., Mareci, T.H.: A novel tensor
distribution model for the diffusion-weighted mr signal. Neuroimage 37(1), 164–176
(2007)
26. Liu, R., Lauze, F., et al.: SE(3) Group Conv. NN and a Study on Group Convs.
and Equivariance for DWI Segmentation (2023).
https://doi.org/10.21203/rs.3.rs-2531880/v1,
https://europepmc.org/article/PPR/PPR624395, preprint
27. Lorenzi, M., Pennec, X.: Geodesics, parallel transport & one-parameter subgroups
for diffeomorphic image registration. International Journal of Computer Vision
105(2), 111–127 (2013)
28. McGraw, T., Vemuri, B.C., Chen, Y., Rao, M., Mareci, T.: Dt-mri denoising and
neuronal fiber tracking. Medical Image Analysis 8(2), 95–111 (2004)
29. Müller, P., Golkov, V., Tomassini, V., Cremers, D.: Rotation-equivariant deep
learning for diffusion mri. arXiv preprint arXiv:2102.06942 (2021)
30. Özarslan, E., Koay, C.G., Shepherd, T.M., Komlosh, M.E., İrfanoğlu, M.O., Pier-
paoli, C., Basser, P.J.: Mean apparent propagator (map) mri: a novel diffusion
imaging method for mapping tissue microstructure. NeuroImage 78, 16–32 (2013)
31. Raffelt, D., Dhollander, T., Tournier, J.D., Tabbara, R., Smith, R.E., Pierre, E.,
Connelly, A.: Bias field correction and intensity normalisation for quantitative anal-
ysis of apparent fibre density. In: Proc. Intl. Soc. Mag. Reson. Med. vol. 25, p. 3541
(2017)
32. Raffelt, D., Tournier, J.D., Crozier, S., Connelly, A., Salvado, O.: Reorientation
of fiber orientation distributions using apodized point spread functions. Magnetic
resonance in medicine 67(3), 844–855 (2012)
33. Raffelt, D., Tournier, J.D., Fripp, J., Crozier, S., Connelly, A., Salvado, O.: Sym-
metric diffeomorphic registration of fibre orientation distributions. Neuroimage
56(3), 1171–1180 (2011)
34. Ramachandran, P., Zoph, B., Le, Q.V.: Searching for activation functions. arXiv
preprint arXiv:1710.05941 (2017)
35. Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomed-
ical image segmentation. In: Medical image computing and computer-assisted
intervention–MICCAI 2015: 18th international conference, Munich, Germany, Oc-
tober 5-9, 2015, proceedings, part III 18. pp. 234–241. Springer (2015)
36. Sriperumbudur, B.K., Gretton, A., Fukumizu, K., Schölkopf, B., Lanckriet, G.R.:
Hilbert space embeddings and metrics on probability measures. The Journal of
Machine Learning Research 11, 1517–1561 (2010)
37. Tao, X., Miller, J.V.: A method for registering diffusion weighted magnetic res-
onance images. In: International Conference on Medical Image Computing and
Computer-Assisted Intervention. pp. 594–602. Springer (2006)
38. Tournier, J.D., Calamante, F., Connelly, A.: Robust determination of the fibre
orientation distribution in diffusion mri: non-negativity constrained super-resolved
spherical deconvolution. Neuroimage 35(4), 1459–1472 (2007)
39. Tournier, J.D., Calamante, F., Gadian, D.G., Connelly, A.: Direct estimation of the
fiber orientation density function from diffusion-weighted mri data using spherical
deconvolution. Neuroimage 23(3), 1176–1185 (2004)
40. Tournier, J.D., Smith, R., Raffelt, D., Tabbara, R., Dhollander, T., Pietsch, M.,
Christiaens, D., Jeurissen, B., Yeh, C.H., Connelly, A.: Mrtrix3: A fast, flexible
and open software framework for medical image processing and visualisation. Neu-
roimage 202, 116137 (2019)
41. Tschumperle, D., Deriche, R.: Variational frameworks for dt-mri estimation, regu-
larization and visualization. In: Proc. of the IEEE International Conf. on Computer
Vision. pp. 116–121 (2003)
42. Van Essen, D.C., Ugurbil, K., Auerbach, E., Barch, D., Behrens, T.E., Bucholz, R.,
Chang, A., Chen, L., Corbetta, M., Curtiss, S.W., et al.: The human connectome
project: a data acquisition perspective. Neuroimage 62(4), 2222–2231 (2012)
43. Vercauteren, T., Pennec, X., Perchant, A., Ayache, N.: Diffeomorphic demons:
Efficient non-parametric image registration. Neuroimage 45(1), S61–S72 (2009)
44. Wang, F., Vemuri, B.C.: Non-rigid multi-modal image registration using
cross-cumulative residual entropy. International Journal of Computer Vision 74,
201–215 (2007)
45. Wang, Z., Vemuri, B.C., Chen, Y., Mareci, T.H.: A constrained variational principle
for direct estimation and smoothing of the diffusion tensor field from complex dwi.
IEEE Transactions on Medical Imaging 23(8), 930–939 (2004)
46. Wasserman, A.G.: Equivariant differential topology. Topology 8(2), 127–150 (1969)
47. Wasserthal, J., Neher, P., Maier-Hein, K.H.: TractSeg - fast and accurate white
matter tract segmentation. NeuroImage 183, 239–253 (2018)
48. Weiler, M.: Equivariant and Coordinate Independent Convolutional Networks. PhD
thesis, University of Amsterdam (2023)
49. Weiler, M., Geiger, M., Welling, M., Boomsma, W., Cohen, T.S.: 3d steerable cnns:
Learning rotationally equivariant features in volumetric data. Advances in Neural
Information Processing Systems 31 (2018)
50. Yang, X., Kwitt, R., Styner, M., Niethammer, M.: Quicksilver: Fast predictive
image registration–a deep learning approach. NeuroImage 158, 378–396 (2017)
51. Zhang, F., Wells, W.M., O’Donnell, L.J.: Deep diffusion mri registration (ddmreg):
a deep learning method for diffusion mri registration. IEEE transactions on medical
imaging 41(6), 1454–1467 (2021)
52. Zhang, H., Yushkevich, P.A., Alexander, D.C., Gee, J.C.: Deformable registration
of diffusion tensor mr images with explicit orientation optimization. Medical image
analysis 10(5), 764–785 (2006)
53. Zhang, P., Niethammer, M., Shen, D., Yap, P.T.: Large deformation diffeomor-
phic registration of diffusion-weighted imaging data. Medical image analysis 18(8),
1290–1298 (2014)
54. Zhdanov, M., Hoffmann, N., Cesa, G.: Implicit convolutional kernels for steerable
cnns. Advances in Neural Information Processing Systems 36 (2024)
55. Özarslan, E., Shepherd, T.M., Vemuri, B.C., Blackband, S.J., Mareci, T.H.: Reso-
lution of complex tissue microarchitecture using the diffusion orientation transform
(dot). Neuroimage (3), 1086–1103 (2005)